Computer Vision¶

PART A - 20 Marks¶

  • DOMAIN: Entertainment

  • CONTEXT: Company X owns a movie application and repository that streams movies to millions of users on a subscription basis. The company wants to automate the extraction of cast and crew information in each scene of a movie, so that when a user pauses the movie and clicks the cast information button, the app shows details of the actors in the scene. The company has in-house computer vision and multimedia experts who need to detect faces in screenshots from movie scenes. The data labelling is already done.

  • DATA DESCRIPTION: The dataset comprises images and, for each image, a mask of the corresponding human face.

  • PROJECT OBJECTIVE: To build a face detection system.

Steps and tasks: [ Total Score: 20 Marks]¶

1. Import and Understand the data [7 Marks]¶

  1. Import and read ‘images.npy’. [1 Marks]
  2. Split the data into Features(X) & labels(Y). Unify shape of all the images. [3 Marks]
    Imp Note: Replace all the pixels within masked area with 1.
    Hint: X will comprise the image arrays, whereas Y will comprise the coordinates of the mask (human face). Observe: data[0], data[0][0], data[0][1].
  3. Split the data into train and test[400:9]. [1 Marks]
  4. Select random image from the train data and display original image and masked image. [2 Marks]
Q. 1.A. Import and read ‘images.npy’.¶
In [ ]:
#Import required libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
from IPython.display import clear_output
import zipfile


import tensorflow as tf



from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC

from tqdm import tqdm

import warnings
warnings.filterwarnings('ignore')

In [ ]:
RunningInCOLAB = 'google.colab' in str(get_ipython()) if hasattr(__builtins__,'__IPYTHON__') else False

if RunningInCOLAB:
    from google.colab import drive
    drive.mount('/content/drive')
In [ ]:
# Load images and labels from the .npy file

if RunningInCOLAB:
    file_path = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/images.npy' # Google Drive path
else:
    file_path = 'images.npy' # Local path
data = np.load(file_path, allow_pickle=True)
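Per the hint (observe data[0], data[0][0], data[0][1]), each element is expected to be a pair of raw image plus a list of annotation dicts. A small standalone helper to summarize one element; the 'label' and 'points' keys are assumptions based on how the masks are consumed in the loop below, and the mock values are hypothetical:

```python
import numpy as np

def describe_sample(sample):
    """Summarize one (image, annotations) element of the object array."""
    image, annotations = sample[0], sample[1]
    return {
        "image_shape": image.shape,                       # raw pixel dimensions
        "num_annotations": len(annotations),              # masks labelled on this image
        "labels": [a.get("label") for a in annotations],  # e.g. ['Face']
    }

# Mock element shaped like one entry of 'data' (hypothetical values)
mock = (np.zeros((300, 400, 3)),
        [{"label": "Face", "points": [{"x": 0.1, "y": 0.2}, {"x": 0.4, "y": 0.6}]}])
print(describe_sample(mock))
# {'image_shape': (300, 400, 3), 'num_annotations': 1, 'labels': ['Face']}
```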
Q. 1.B. Split the data into Features(X) & labels(Y). Unify shape of all the images.¶
In [ ]:
#First standardize the image shape

#We use MobileNetV2 for transfer learning. This model expects the input image to be of shape (224,224,3)
image_height = 224
image_width  = 224
channels     = 3

#Create X and Y sets
X = np.zeros((int(data.shape[0]),image_height, image_width, 3)) #Contains the original image (reshaped)
Y = np.zeros((int(data.shape[0]), image_height, image_width)) #Contains masks corresponding to the face co-ordinates

#Now populate the X and Y sets
no_of_images = len(data)
#Loop through the data to extract the face region and replace the pixel values with 1
for i in range(no_of_images):
    img = data[i][0]  #Load original image array
    img = cv2.resize(img, dsize=(image_width, image_height), interpolation=cv2.INTER_CUBIC) #Resize to 224x224; note cv2 dsize is (width, height)

    #Standardize every image to 3 channels: discard the alpha channel if it exists,
    #and convert any grayscale (single-channel) image to 3-channel color
    if img.ndim == 3:
        img = img[:,:,:3] #Discard the alpha channel if it exists
    else:
        #Convert the grayscale image to color so that the number of channels is standardized to 3
        print(f"Found image {i} as grayscale, converting it to a 3-channel color image.")
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    #Now populate the X and Y sets
    X[i] = np.array(img, dtype=np.float32)
    for mask in data[i][1]:
        if 'Face' in mask['label']:
            x1=int(mask['points'][0]['x'] * image_width)
            y1=int(mask['points'][0]['y'] * image_height)
            x2=int(mask['points'][1]['x'] * image_width)
            y2=int(mask['points'][1]['y'] * image_height)
            Y[i][y1:y2, x1:x2] = 1 # set all pixels within the mask co-ordinates to 1.

print(f"X and Y populated, shape of X is '{X.shape}' and the shape of Y is '{Y.shape}' ")
Found image 272 as grayscale, converting it to a 3-channel color image.
X and Y populated, shape of X is '(409, 224, 224, 3)' and the shape of Y is '(409, 224, 224)' 
Q. 1.C. Split the data into train and test[400:9].¶
In [ ]:
#Split X and Y into train and test sets with a 400:9 ratio
#An integer test_size guarantees exactly 9 test samples; the float 0.0225
#would be rounded up to ceil(0.0225 * 409) = 10 by train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=9, random_state=42)
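The 400:9 behaviour can be verified standalone: train_test_split accepts an absolute integer test_size, which pins the held-out count exactly. A sketch with dummy arrays (not the real X/Y):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_demo = np.arange(409 * 2).reshape(409, 2)  # 409 dummy samples, 2 features each
Y_demo = np.arange(409)

# Integer test_size -> exactly 9 held-out samples
Xtr, Xte, Ytr, Yte = train_test_split(X_demo, Y_demo, test_size=9, random_state=42)
print(Xtr.shape, Xte.shape)  # (400, 2) (9, 2)
```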
Q. 1.D. Select random image from the train data and display original image and masked image.¶
In [ ]:
def show_image(index):
    fig, axs = plt.subplots(1, 3, figsize=(20, 10))
    axs[0].imshow((X_train[index]/255).astype(np.float32))
    axs[0].set_title("Original Image")
    axs[0].axis('off')
    axs[1].imshow(Y_train[index])
    axs[1].set_title("Masked Area where Face is found")
    axs[1].axis('off')
    #Draw a contour around the labeled face using the mask
    contours, _ = cv2.findContours(Y_train[index].astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contoured_image = np.copy(X_train[index])
    cv2.drawContours(contoured_image, contours, -1, (0, 255, 0), 1)
    axs[2].imshow((contoured_image/255).astype(np.float32))
    axs[2].set_title("Labeled Image(Colored Contour around Face)")
    axs[2].axis('off')
    plt.show()

#Select 4 random images from train data and display original image and masked image
indexes = np.random.randint(0, X_train.shape[0], size=4)
for index in indexes:
    show_image(index)
[Output: four randomly selected training samples, each displayed as original image, face mask, and contoured image]

2. Model building [11 Marks]¶

  1. Design a face mask detection model. [4 Marks]
    Hint: 1. Use MobileNet architecture for initial pre-trained non-trainable layers.
    Hint: 2. Add appropriate Upsampling layers to imitate U-net architecture.
  2. Design your own Dice Coefficient and Loss function. [2 Marks]
  3. Train and tune the model as required. [3 Marks]
  4. Evaluate and share insights on performance of the model. [2 Marks]
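Before wiring it into Keras, the Dice coefficient asked for in task 2.B can be sketched in plain NumPy (a Keras version would swap np.sum for tf.reduce_sum); the smoothing constant of 1.0 is an assumed choice to avoid division by zero on empty masks:

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """2*|A ∩ B| / (|A| + |B|), smoothed to avoid 0/0 on empty masks."""
    y_true_f = y_true.flatten()
    y_pred_f = y_pred.flatten()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (np.sum(y_true_f) + np.sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    # Loss falls as the predicted mask overlaps the ground truth better
    return 1.0 - dice_coefficient(y_true, y_pred)

print(dice_coefficient(np.ones((4, 4)), np.ones((4, 4))))   # 1.0 (perfect overlap)
print(dice_coefficient(np.ones((2, 2)), np.zeros((2, 2))))  # 0.2 (no overlap, smoothed)
```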
Q. 2.A. Design a face mask detection model¶
In [ ]:
# Set some hyper parameters
epochs = 500 #@param {type:"integer"}
batch_size = 8 #@param {type:"integer"}
learning_rate = 1e-4 #@param {type:"number"}
In [ ]:
# Define the model
def model():

    #Define the model

    #We will use MobileNetV2 for transfer learning. This model expects the input image to be of shape (224,224,3)
    #Input Image Layer
    input = tf.keras.layers.Input([image_height, image_width, 3], dtype = tf.uint8, name="original_input_image")

    #Preprocess the input image
    x = tf.cast(input, tf.float32)

    input_image_name = x.name.split('/')[0]

    x = tf.keras.applications.mobilenet.preprocess_input(x)

    #Load the MobileNetV2 model with the preprocessed input images
    encoder = tf.keras.applications.MobileNetV2(input_tensor=x, input_shape=(image_height,image_width, 3), weights="imagenet", include_top=False, alpha=0.35)

    #make encoder layer (including all sub-layers in it) non-trainable
    encoder.trainable = False

    encoder_output = encoder.get_layer("block_13_expand_relu").output

    skip_connection_names = [input_image_name, "block_1_expand_relu", "block_3_expand_relu", "block_6_expand_relu"]

    #Decoder

    #Number of convolution filters; the decoder will use them in reverse order
    f = [16, 32, 48, 64]

    #The below 'x' will be the input to our decoder
    x = encoder_output

    # There are four repeated decoder blocks:
    ### each block has 3 sub-layers:
    #### 1. Upsampling (doubling the spatial dimensions) and concatenating with the output of the
    ####    corresponding encoder layer (in order: 'input image', 'block 1 relu', 'block 3 relu', 'block 6 relu')
    #### 2. Convolution with a 3x3 kernel, batch normalization and ReLU activation
    ####    (the number of filters decreases per block in this order: 64, 48, 32, 16)
    #### 3. Another convolution with a 3x3 kernel, batch normalization and ReLU activation (same filter counts)
    for i in range(1, len(skip_connection_names)+1, 1):

        #Sub-layer 1
        x_skip = encoder.get_layer(skip_connection_names[-i]).output
        x = tf.keras.layers.UpSampling2D((2, 2))(x)
        x = tf.keras.layers.Concatenate()([x, x_skip])

        #Sub-layer 2
        x = tf.keras.layers.Conv2D(f[-i], (3, 3), padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)

        #Sub-layer 3
        x = tf.keras.layers.Conv2D(f[-i], (3, 3), padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)

    #Output with sigmoid activation - a per-pixel probability in [0, 1]
    x = tf.keras.layers.Conv2D(1, (1, 1), padding="same")(x)
    output = tf.keras.layers.Activation("sigmoid")(x)


    model = tf.keras.models.Model(inputs=[input], outputs=[output])
    return model
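A quick sanity check on the decoder geometry: the bottleneck taken at block_13_expand_relu has 14×14 spatial resolution for a 224×224 input, so four 2× upsamplings recover the original size:

```python
# Bottleneck: block_13_expand_relu is 14x14 for a 224x224 input
size = 14
for _ in range(4):  # four UpSampling2D((2, 2)) blocks in the decoder
    size *= 2       # 28 -> 56 -> 112 -> 224
print(size)  # 224
```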
In [ ]:
#Create the model
model = model()
model.summary()
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 original_input_image (Inpu  [(None, 224, 224, 3)]        0         []                            
 tLayer)                                                                                          
                                                                                                  
 tf.cast (TFOpLambda)        (None, 224, 224, 3)          0         ['original_input_image[0][0]']
                                                                                                  
 tf.math.truediv (TFOpLambd  (None, 224, 224, 3)          0         ['tf.cast[0][0]']             
 a)                                                                                               
                                                                                                  
 tf.math.subtract (TFOpLamb  (None, 224, 224, 3)          0         ['tf.math.truediv[0][0]']     
 da)                                                                                              
                                                                                                  
 Conv1 (Conv2D)              (None, 112, 112, 16)         432       ['tf.math.subtract[0][0]']    
                                                                                                  
 bn_Conv1 (BatchNormalizati  (None, 112, 112, 16)         64        ['Conv1[0][0]']               
 on)                                                                                              
                                                                                                  
 Conv1_relu (ReLU)           (None, 112, 112, 16)         0         ['bn_Conv1[0][0]']            
                                                                                                  
 expanded_conv_depthwise (D  (None, 112, 112, 16)         144       ['Conv1_relu[0][0]']          
 epthwiseConv2D)                                                                                  
                                                                                                  
 expanded_conv_depthwise_BN  (None, 112, 112, 16)         64        ['expanded_conv_depthwise[0][0
  (BatchNormalization)                                              ]']                           
                                                                                                  
 expanded_conv_depthwise_re  (None, 112, 112, 16)         0         ['expanded_conv_depthwise_BN[0
 lu (ReLU)                                                          ][0]']                        
                                                                                                  
 expanded_conv_project (Con  (None, 112, 112, 8)          128       ['expanded_conv_depthwise_relu
 v2D)                                                               [0][0]']                      
                                                                                                  
 expanded_conv_project_BN (  (None, 112, 112, 8)          32        ['expanded_conv_project[0][0]'
 BatchNormalization)                                                ]                             
                                                                                                  
 block_1_expand (Conv2D)     (None, 112, 112, 48)         384       ['expanded_conv_project_BN[0][
                                                                    0]']                          
                                                                                                  
 block_1_expand_BN (BatchNo  (None, 112, 112, 48)         192       ['block_1_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_1_expand_relu (ReLU)  (None, 112, 112, 48)         0         ['block_1_expand_BN[0][0]']   
                                                                                                  
 block_1_pad (ZeroPadding2D  (None, 113, 113, 48)         0         ['block_1_expand_relu[0][0]'] 
 )                                                                                                
                                                                                                  
 block_1_depthwise (Depthwi  (None, 56, 56, 48)           432       ['block_1_pad[0][0]']         
 seConv2D)                                                                                        
                                                                                                  
 block_1_depthwise_BN (Batc  (None, 56, 56, 48)           192       ['block_1_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_1_depthwise_relu (Re  (None, 56, 56, 48)           0         ['block_1_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_1_project (Conv2D)    (None, 56, 56, 8)            384       ['block_1_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_1_project_BN (BatchN  (None, 56, 56, 8)            32        ['block_1_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_2_expand (Conv2D)     (None, 56, 56, 48)           384       ['block_1_project_BN[0][0]']  
                                                                                                  
 block_2_expand_BN (BatchNo  (None, 56, 56, 48)           192       ['block_2_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_2_expand_relu (ReLU)  (None, 56, 56, 48)           0         ['block_2_expand_BN[0][0]']   
                                                                                                  
 block_2_depthwise (Depthwi  (None, 56, 56, 48)           432       ['block_2_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_2_depthwise_BN (Batc  (None, 56, 56, 48)           192       ['block_2_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_2_depthwise_relu (Re  (None, 56, 56, 48)           0         ['block_2_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_2_project (Conv2D)    (None, 56, 56, 8)            384       ['block_2_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_2_project_BN (BatchN  (None, 56, 56, 8)            32        ['block_2_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_2_add (Add)           (None, 56, 56, 8)            0         ['block_1_project_BN[0][0]',  
                                                                     'block_2_project_BN[0][0]']  
                                                                                                  
 block_3_expand (Conv2D)     (None, 56, 56, 48)           384       ['block_2_add[0][0]']         
                                                                                                  
 block_3_expand_BN (BatchNo  (None, 56, 56, 48)           192       ['block_3_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_3_expand_relu (ReLU)  (None, 56, 56, 48)           0         ['block_3_expand_BN[0][0]']   
                                                                                                  
 block_3_pad (ZeroPadding2D  (None, 57, 57, 48)           0         ['block_3_expand_relu[0][0]'] 
 )                                                                                                
                                                                                                  
 block_3_depthwise (Depthwi  (None, 28, 28, 48)           432       ['block_3_pad[0][0]']         
 seConv2D)                                                                                        
                                                                                                  
 block_3_depthwise_BN (Batc  (None, 28, 28, 48)           192       ['block_3_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_3_depthwise_relu (Re  (None, 28, 28, 48)           0         ['block_3_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_3_project (Conv2D)    (None, 28, 28, 16)           768       ['block_3_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_3_project_BN (BatchN  (None, 28, 28, 16)           64        ['block_3_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_4_expand (Conv2D)     (None, 28, 28, 96)           1536      ['block_3_project_BN[0][0]']  
                                                                                                  
 block_4_expand_BN (BatchNo  (None, 28, 28, 96)           384       ['block_4_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_4_expand_relu (ReLU)  (None, 28, 28, 96)           0         ['block_4_expand_BN[0][0]']   
                                                                                                  
 block_4_depthwise (Depthwi  (None, 28, 28, 96)           864       ['block_4_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_4_depthwise_BN (Batc  (None, 28, 28, 96)           384       ['block_4_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_4_depthwise_relu (Re  (None, 28, 28, 96)           0         ['block_4_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_4_project (Conv2D)    (None, 28, 28, 16)           1536      ['block_4_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_4_project_BN (BatchN  (None, 28, 28, 16)           64        ['block_4_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_4_add (Add)           (None, 28, 28, 16)           0         ['block_3_project_BN[0][0]',  
                                                                     'block_4_project_BN[0][0]']  
                                                                                                  
 block_5_expand (Conv2D)     (None, 28, 28, 96)           1536      ['block_4_add[0][0]']         
                                                                                                  
 block_5_expand_BN (BatchNo  (None, 28, 28, 96)           384       ['block_5_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_5_expand_relu (ReLU)  (None, 28, 28, 96)           0         ['block_5_expand_BN[0][0]']   
                                                                                                  
 block_5_depthwise (Depthwi  (None, 28, 28, 96)           864       ['block_5_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_5_depthwise_BN (Batc  (None, 28, 28, 96)           384       ['block_5_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_5_depthwise_relu (Re  (None, 28, 28, 96)           0         ['block_5_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_5_project (Conv2D)    (None, 28, 28, 16)           1536      ['block_5_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_5_project_BN (BatchN  (None, 28, 28, 16)           64        ['block_5_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_5_add (Add)           (None, 28, 28, 16)           0         ['block_4_add[0][0]',         
                                                                     'block_5_project_BN[0][0]']  
                                                                                                  
 block_6_expand (Conv2D)     (None, 28, 28, 96)           1536      ['block_5_add[0][0]']         
                                                                                                  
 block_6_expand_BN (BatchNo  (None, 28, 28, 96)           384       ['block_6_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_6_expand_relu (ReLU)  (None, 28, 28, 96)           0         ['block_6_expand_BN[0][0]']   
                                                                                                  
 block_6_pad (ZeroPadding2D  (None, 29, 29, 96)           0         ['block_6_expand_relu[0][0]'] 
 )                                                                                                
                                                                                                  
 block_6_depthwise (Depthwi  (None, 14, 14, 96)           864       ['block_6_pad[0][0]']         
 seConv2D)                                                                                        
                                                                                                  
 block_6_depthwise_BN (Batc  (None, 14, 14, 96)           384       ['block_6_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_6_depthwise_relu (Re  (None, 14, 14, 96)           0         ['block_6_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_6_project (Conv2D)    (None, 14, 14, 24)           2304      ['block_6_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_6_project_BN (BatchN  (None, 14, 14, 24)           96        ['block_6_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_7_expand (Conv2D)     (None, 14, 14, 144)          3456      ['block_6_project_BN[0][0]']  
                                                                                                  
 block_7_expand_BN (BatchNo  (None, 14, 14, 144)          576       ['block_7_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_7_expand_relu (ReLU)  (None, 14, 14, 144)          0         ['block_7_expand_BN[0][0]']   
                                                                                                  
 block_7_depthwise (Depthwi  (None, 14, 14, 144)          1296      ['block_7_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_7_depthwise_BN (Batc  (None, 14, 14, 144)          576       ['block_7_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_7_depthwise_relu (Re  (None, 14, 14, 144)          0         ['block_7_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_7_project (Conv2D)    (None, 14, 14, 24)           3456      ['block_7_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_7_project_BN (BatchN  (None, 14, 14, 24)           96        ['block_7_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_7_add (Add)           (None, 14, 14, 24)           0         ['block_6_project_BN[0][0]',  
                                                                     'block_7_project_BN[0][0]']  
                                                                                                  
 block_8_expand (Conv2D)     (None, 14, 14, 144)          3456      ['block_7_add[0][0]']         
                                                                                                  
 block_8_expand_BN (BatchNo  (None, 14, 14, 144)          576       ['block_8_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_8_expand_relu (ReLU)  (None, 14, 14, 144)          0         ['block_8_expand_BN[0][0]']   
                                                                                                  
 block_8_depthwise (Depthwi  (None, 14, 14, 144)          1296      ['block_8_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_8_depthwise_BN (Batc  (None, 14, 14, 144)          576       ['block_8_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_8_depthwise_relu (Re  (None, 14, 14, 144)          0         ['block_8_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_8_project (Conv2D)    (None, 14, 14, 24)           3456      ['block_8_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_8_project_BN (BatchN  (None, 14, 14, 24)           96        ['block_8_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_8_add (Add)           (None, 14, 14, 24)           0         ['block_7_add[0][0]',         
                                                                     'block_8_project_BN[0][0]']  
                                                                                                  
 block_9_expand (Conv2D)     (None, 14, 14, 144)          3456      ['block_8_add[0][0]']         
                                                                                                  
 block_9_expand_BN (BatchNo  (None, 14, 14, 144)          576       ['block_9_expand[0][0]']      
 rmalization)                                                                                     
                                                                                                  
 block_9_expand_relu (ReLU)  (None, 14, 14, 144)          0         ['block_9_expand_BN[0][0]']   
                                                                                                  
 block_9_depthwise (Depthwi  (None, 14, 14, 144)          1296      ['block_9_expand_relu[0][0]'] 
 seConv2D)                                                                                        
                                                                                                  
 block_9_depthwise_BN (Batc  (None, 14, 14, 144)          576       ['block_9_depthwise[0][0]']   
 hNormalization)                                                                                  
                                                                                                  
 block_9_depthwise_relu (Re  (None, 14, 14, 144)          0         ['block_9_depthwise_BN[0][0]']
 LU)                                                                                              
                                                                                                  
 block_9_project (Conv2D)    (None, 14, 14, 24)           3456      ['block_9_depthwise_relu[0][0]
                                                                    ']                            
                                                                                                  
 block_9_project_BN (BatchN  (None, 14, 14, 24)           96        ['block_9_project[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_9_add (Add)           (None, 14, 14, 24)           0         ['block_8_add[0][0]',         
                                                                     'block_9_project_BN[0][0]']  
                                                                                                  
 block_10_expand (Conv2D)    (None, 14, 14, 144)          3456      ['block_9_add[0][0]']         
                                                                                                  
 block_10_expand_BN (BatchN  (None, 14, 14, 144)          576       ['block_10_expand[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_10_expand_relu (ReLU  (None, 14, 14, 144)          0         ['block_10_expand_BN[0][0]']  
 )                                                                                                
                                                                                                  
 block_10_depthwise (Depthw  (None, 14, 14, 144)          1296      ['block_10_expand_relu[0][0]']
 iseConv2D)                                                                                       
                                                                                                  
 block_10_depthwise_BN (Bat  (None, 14, 14, 144)          576       ['block_10_depthwise[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 block_10_depthwise_relu (R  (None, 14, 14, 144)          0         ['block_10_depthwise_BN[0][0]'
 eLU)                                                               ]                             
                                                                                                  
 block_10_project (Conv2D)   (None, 14, 14, 32)           4608      ['block_10_depthwise_relu[0][0
                                                                    ]']                           
                                                                                                  
 block_10_project_BN (Batch  (None, 14, 14, 32)           128       ['block_10_project[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 block_11_expand (Conv2D)    (None, 14, 14, 192)          6144      ['block_10_project_BN[0][0]'] 
                                                                                                  
 block_11_expand_BN (BatchN  (None, 14, 14, 192)          768       ['block_11_expand[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_11_expand_relu (ReLU  (None, 14, 14, 192)          0         ['block_11_expand_BN[0][0]']  
 )                                                                                                
                                                                                                  
 block_11_depthwise (Depthw  (None, 14, 14, 192)          1728      ['block_11_expand_relu[0][0]']
 iseConv2D)                                                                                       
                                                                                                  
 block_11_depthwise_BN (Bat  (None, 14, 14, 192)          768       ['block_11_depthwise[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 block_11_depthwise_relu (R  (None, 14, 14, 192)          0         ['block_11_depthwise_BN[0][0]'
 eLU)                                                               ]                             
                                                                                                  
 block_11_project (Conv2D)   (None, 14, 14, 32)           6144      ['block_11_depthwise_relu[0][0
                                                                    ]']                           
                                                                                                  
 block_11_project_BN (Batch  (None, 14, 14, 32)           128       ['block_11_project[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 block_11_add (Add)          (None, 14, 14, 32)           0         ['block_10_project_BN[0][0]', 
                                                                     'block_11_project_BN[0][0]'] 
                                                                                                  
 block_12_expand (Conv2D)    (None, 14, 14, 192)          6144      ['block_11_add[0][0]']        
                                                                                                  
 block_12_expand_BN (BatchN  (None, 14, 14, 192)          768       ['block_12_expand[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_12_expand_relu (ReLU  (None, 14, 14, 192)          0         ['block_12_expand_BN[0][0]']  
 )                                                                                                
                                                                                                  
 block_12_depthwise (Depthw  (None, 14, 14, 192)          1728      ['block_12_expand_relu[0][0]']
 iseConv2D)                                                                                       
                                                                                                  
 block_12_depthwise_BN (Bat  (None, 14, 14, 192)          768       ['block_12_depthwise[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 block_12_depthwise_relu (R  (None, 14, 14, 192)          0         ['block_12_depthwise_BN[0][0]'
 eLU)                                                               ]                             
                                                                                                  
 block_12_project (Conv2D)   (None, 14, 14, 32)           6144      ['block_12_depthwise_relu[0][0
                                                                    ]']                           
                                                                                                  
 block_12_project_BN (Batch  (None, 14, 14, 32)           128       ['block_12_project[0][0]']    
 Normalization)                                                                                   
                                                                                                  
 block_12_add (Add)          (None, 14, 14, 32)           0         ['block_11_add[0][0]',        
                                                                     'block_12_project_BN[0][0]'] 
                                                                                                  
 block_13_expand (Conv2D)    (None, 14, 14, 192)          6144      ['block_12_add[0][0]']        
                                                                                                  
 block_13_expand_BN (BatchN  (None, 14, 14, 192)          768       ['block_13_expand[0][0]']     
 ormalization)                                                                                    
                                                                                                  
 block_13_expand_relu (ReLU  (None, 14, 14, 192)          0         ['block_13_expand_BN[0][0]']  
 )                                                                                                
                                                                                                  
 up_sampling2d (UpSampling2  (None, 28, 28, 192)          0         ['block_13_expand_relu[0][0]']
 D)                                                                                               
                                                                                                  
 concatenate (Concatenate)   (None, 28, 28, 288)          0         ['up_sampling2d[0][0]',       
                                                                     'block_6_expand_relu[0][0]'] 
                                                                                                  
 conv2d (Conv2D)             (None, 28, 28, 64)           165952    ['concatenate[0][0]']         
                                                                                                  
 batch_normalization (Batch  (None, 28, 28, 64)           256       ['conv2d[0][0]']              
 Normalization)                                                                                   
                                                                                                  
 activation (Activation)     (None, 28, 28, 64)           0         ['batch_normalization[0][0]'] 
                                                                                                  
 conv2d_1 (Conv2D)           (None, 28, 28, 64)           36928     ['activation[0][0]']          
                                                                                                  
 batch_normalization_1 (Bat  (None, 28, 28, 64)           256       ['conv2d_1[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_1 (Activation)   (None, 28, 28, 64)           0         ['batch_normalization_1[0][0]'
                                                                    ]                             
                                                                                                  
 up_sampling2d_1 (UpSamplin  (None, 56, 56, 64)           0         ['activation_1[0][0]']        
 g2D)                                                                                             
                                                                                                  
 concatenate_1 (Concatenate  (None, 56, 56, 112)          0         ['up_sampling2d_1[0][0]',     
 )                                                                   'block_3_expand_relu[0][0]'] 
                                                                                                  
 conv2d_2 (Conv2D)           (None, 56, 56, 48)           48432     ['concatenate_1[0][0]']       
                                                                                                  
 batch_normalization_2 (Bat  (None, 56, 56, 48)           192       ['conv2d_2[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_2 (Activation)   (None, 56, 56, 48)           0         ['batch_normalization_2[0][0]'
                                                                    ]                             
                                                                                                  
 conv2d_3 (Conv2D)           (None, 56, 56, 48)           20784     ['activation_2[0][0]']        
                                                                                                  
 batch_normalization_3 (Bat  (None, 56, 56, 48)           192       ['conv2d_3[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_3 (Activation)   (None, 56, 56, 48)           0         ['batch_normalization_3[0][0]'
                                                                    ]                             
                                                                                                  
 up_sampling2d_2 (UpSamplin  (None, 112, 112, 48)         0         ['activation_3[0][0]']        
 g2D)                                                                                             
                                                                                                  
 concatenate_2 (Concatenate  (None, 112, 112, 96)         0         ['up_sampling2d_2[0][0]',     
 )                                                                   'block_1_expand_relu[0][0]'] 
                                                                                                  
 conv2d_4 (Conv2D)           (None, 112, 112, 32)         27680     ['concatenate_2[0][0]']       
                                                                                                  
 batch_normalization_4 (Bat  (None, 112, 112, 32)         128       ['conv2d_4[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_4 (Activation)   (None, 112, 112, 32)         0         ['batch_normalization_4[0][0]'
                                                                    ]                             
                                                                                                  
 conv2d_5 (Conv2D)           (None, 112, 112, 32)         9248      ['activation_4[0][0]']        
                                                                                                  
 batch_normalization_5 (Bat  (None, 112, 112, 32)         128       ['conv2d_5[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_5 (Activation)   (None, 112, 112, 32)         0         ['batch_normalization_5[0][0]'
                                                                    ]                             
                                                                                                  
 up_sampling2d_3 (UpSamplin  (None, 224, 224, 32)         0         ['activation_5[0][0]']        
 g2D)                                                                                             
                                                                                                  
 concatenate_3 (Concatenate  (None, 224, 224, 35)         0         ['up_sampling2d_3[0][0]',     
 )                                                                   'tf.cast[0][0]']             
                                                                                                  
 conv2d_6 (Conv2D)           (None, 224, 224, 16)         5056      ['concatenate_3[0][0]']       
                                                                                                  
 batch_normalization_6 (Bat  (None, 224, 224, 16)         64        ['conv2d_6[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_6 (Activation)   (None, 224, 224, 16)         0         ['batch_normalization_6[0][0]'
                                                                    ]                             
                                                                                                  
 conv2d_7 (Conv2D)           (None, 224, 224, 16)         2320      ['activation_6[0][0]']        
                                                                                                  
 batch_normalization_7 (Bat  (None, 224, 224, 16)         64        ['conv2d_7[0][0]']            
 chNormalization)                                                                                 
                                                                                                  
 activation_7 (Activation)   (None, 224, 224, 16)         0         ['batch_normalization_7[0][0]'
                                                                    ]                             
                                                                                                  
 conv2d_8 (Conv2D)           (None, 224, 224, 1)          17        ['activation_7[0][0]']        
                                                                                                  
 activation_8 (Activation)   (None, 224, 224, 1)          0         ['conv2d_8[0][0]']            
                                                                                                  
==================================================================================================
Total params: 416209 (1.59 MB)
Trainable params: 317057 (1.21 MB)
Non-trainable params: 99152 (387.31 KB)
__________________________________________________________________________________________________
In [ ]:
#Show the model architecture
if RunningInCOLAB:
    dot_img_file = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/model.png' # Google Drive path
else:
    dot_img_file = 'model.png' # Local path
tf.keras.utils.plot_model(model, to_file=dot_img_file, show_shapes=True, show_layer_activations=True, show_trainable=True)
Out[ ]:
[Rendered model architecture diagram (model.png)]
Q. 2.B. Design your own Dice Coefficient and Loss function.¶
In [ ]:
smooth = 1e-15 #@param
def dice_coef(y_true, y_pred):
    # Flatten both masks to 1-D so the sums run over every pixel
    y_true = tf.keras.layers.Flatten()(y_true)
    y_pred = tf.keras.layers.Flatten()(y_pred)
    intersection = tf.reduce_sum(y_true * y_pred)
    # Dice = 2*|A.B| / (|A| + |B|); smooth guards against division by zero
    return (2. * intersection + smooth) / (tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # Minimizing 1 - Dice maximizes overlap between predicted and true masks
    return 1.0 - dice_coef(y_true, y_pred)
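As a sanity check on the formula (not part of the notebook's pipeline), the same Dice computation can be reproduced in plain NumPy on toy masks, with the same smoothing constant:

```python
import numpy as np

# NumPy mirror of the TF dice_coef above, for quick verification on tiny arrays.
smooth = 1e-15

def dice_coef_np(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

print(dice_coef_np([1, 1, 0, 0], [1, 1, 0, 0]))  # perfect overlap -> 1.0
print(dice_coef_np([1, 1, 0, 0], [0, 0, 1, 1]))  # no overlap -> ~0.0
print(dice_coef_np([1, 1, 0, 0], [1, 0, 1, 0]))  # half overlap -> ~0.5
```

The `smooth` term only matters when both masks are empty, where it keeps the ratio at 1 instead of 0/0.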
Q. 2.C. Train and tune the model as required.¶
In [ ]:
opt = tf.keras.optimizers.Nadam(learning_rate)
metrics = [dice_coef, tf.keras.metrics.Recall(), tf.keras.metrics.Precision()]
model.compile(loss=dice_loss, optimizer=opt, metrics=metrics)
In [ ]:
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=4),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=False)
]
In [ ]:
train_steps = len(X_train)//batch_size
valid_steps = len(X_test)//batch_size

if len(X_train) % batch_size != 0:
    train_steps += 1
if len(X_test) % batch_size != 0:
    valid_steps += 1


model_run=model.fit(
    X_train, Y_train,
    validation_data=(X_test, Y_test),
    epochs=epochs,
    steps_per_epoch=train_steps,
    validation_steps=valid_steps,
    callbacks=callbacks
)
Epoch 1/500
50/50 [==============================] - 20s 80ms/step - loss: 0.7813 - dice_coef: 0.2188 - recall: 0.3800 - precision: 0.2178 - val_loss: 0.6826 - val_dice_coef: 0.3174 - val_recall: 0.8907 - val_precision: 0.2554 - lr: 1.0000e-04
Epoch 2/500
50/50 [==============================] - 1s 21ms/step - loss: 0.7163 - dice_coef: 0.2838 - recall: 0.6181 - precision: 0.2810 - val_loss: 0.6085 - val_dice_coef: 0.3915 - val_recall: 0.7968 - val_precision: 0.4823 - lr: 1.0000e-04
Epoch 3/500
50/50 [==============================] - 1s 21ms/step - loss: 0.6485 - dice_coef: 0.3515 - recall: 0.7867 - precision: 0.3660 - val_loss: 0.5944 - val_dice_coef: 0.4056 - val_recall: 0.8224 - val_precision: 0.5350 - lr: 1.0000e-04
Epoch 4/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5891 - dice_coef: 0.4109 - recall: 0.8478 - precision: 0.4518 - val_loss: 0.5915 - val_dice_coef: 0.4085 - val_recall: 0.8521 - val_precision: 0.5139 - lr: 1.0000e-04
Epoch 5/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5493 - dice_coef: 0.4507 - recall: 0.8629 - precision: 0.4947 - val_loss: 0.5895 - val_dice_coef: 0.4105 - val_recall: 0.7931 - val_precision: 0.5721 - lr: 1.0000e-04
Epoch 6/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5225 - dice_coef: 0.4776 - recall: 0.8641 - precision: 0.5254 - val_loss: 0.5663 - val_dice_coef: 0.4337 - val_recall: 0.8164 - val_precision: 0.5796 - lr: 1.0000e-04
Epoch 7/500
50/50 [==============================] - 1s 20ms/step - loss: 0.5027 - dice_coef: 0.4972 - recall: 0.8722 - precision: 0.5453 - val_loss: 0.5363 - val_dice_coef: 0.4637 - val_recall: 0.8078 - val_precision: 0.6245 - lr: 1.0000e-04
Epoch 8/500
50/50 [==============================] - 1s 21ms/step - loss: 0.4851 - dice_coef: 0.5148 - recall: 0.8721 - precision: 0.5680 - val_loss: 0.5164 - val_dice_coef: 0.4836 - val_recall: 0.7867 - val_precision: 0.6555 - lr: 1.0000e-04
Epoch 9/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4686 - dice_coef: 0.5310 - recall: 0.8746 - precision: 0.5836 - val_loss: 0.5248 - val_dice_coef: 0.4752 - val_recall: 0.6758 - val_precision: 0.7468 - lr: 1.0000e-04
Epoch 10/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4497 - dice_coef: 0.5503 - recall: 0.8785 - precision: 0.6051 - val_loss: 0.5022 - val_dice_coef: 0.4978 - val_recall: 0.7533 - val_precision: 0.6416 - lr: 1.0000e-04
Epoch 11/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4397 - dice_coef: 0.5601 - recall: 0.8807 - precision: 0.6152 - val_loss: 0.5357 - val_dice_coef: 0.4643 - val_recall: 0.5722 - val_precision: 0.8024 - lr: 1.0000e-04
Epoch 12/500
50/50 [==============================] - 1s 21ms/step - loss: 0.4271 - dice_coef: 0.5726 - recall: 0.8813 - precision: 0.6297 - val_loss: 0.4766 - val_dice_coef: 0.5234 - val_recall: 0.7090 - val_precision: 0.6826 - lr: 1.0000e-04
Epoch 13/500
50/50 [==============================] - 1s 22ms/step - loss: 0.4103 - dice_coef: 0.5899 - recall: 0.8846 - precision: 0.6532 - val_loss: 0.4621 - val_dice_coef: 0.5379 - val_recall: 0.8352 - val_precision: 0.5763 - lr: 1.0000e-04
Epoch 14/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3923 - dice_coef: 0.6073 - recall: 0.8969 - precision: 0.6601 - val_loss: 0.4741 - val_dice_coef: 0.5259 - val_recall: 0.6496 - val_precision: 0.7528 - lr: 1.0000e-04
Epoch 15/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3881 - dice_coef: 0.6117 - recall: 0.8878 - precision: 0.6697 - val_loss: 0.5142 - val_dice_coef: 0.4858 - val_recall: 0.5323 - val_precision: 0.7951 - lr: 1.0000e-04
Epoch 16/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3740 - dice_coef: 0.6263 - recall: 0.8937 - precision: 0.6882 - val_loss: 0.4614 - val_dice_coef: 0.5386 - val_recall: 0.6619 - val_precision: 0.7346 - lr: 1.0000e-04
Epoch 17/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3600 - dice_coef: 0.6400 - recall: 0.8994 - precision: 0.6985 - val_loss: 0.5676 - val_dice_coef: 0.4324 - val_recall: 0.4280 - val_precision: 0.8428 - lr: 1.0000e-04
Epoch 18/500
50/50 [==============================] - 1s 22ms/step - loss: 0.3446 - dice_coef: 0.6555 - recall: 0.8980 - precision: 0.7203 - val_loss: 0.4159 - val_dice_coef: 0.5841 - val_recall: 0.7333 - val_precision: 0.6615 - lr: 1.0000e-04
Epoch 19/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3418 - dice_coef: 0.6583 - recall: 0.8999 - precision: 0.7167 - val_loss: 0.4193 - val_dice_coef: 0.5807 - val_recall: 0.7312 - val_precision: 0.6737 - lr: 1.0000e-04
Epoch 20/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3257 - dice_coef: 0.6740 - recall: 0.9003 - precision: 0.7375 - val_loss: 0.4519 - val_dice_coef: 0.5481 - val_recall: 0.6209 - val_precision: 0.7225 - lr: 1.0000e-04
Epoch 21/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3169 - dice_coef: 0.6827 - recall: 0.8995 - precision: 0.7484 - val_loss: 0.4186 - val_dice_coef: 0.5814 - val_recall: 0.6959 - val_precision: 0.6862 - lr: 1.0000e-04
Epoch 22/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3066 - dice_coef: 0.6934 - recall: 0.9026 - precision: 0.7565 - val_loss: 0.4054 - val_dice_coef: 0.5946 - val_recall: 0.7827 - val_precision: 0.6048 - lr: 1.0000e-04
Epoch 23/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2924 - dice_coef: 0.7076 - recall: 0.9089 - precision: 0.7663 - val_loss: 0.5488 - val_dice_coef: 0.4512 - val_recall: 0.4000 - val_precision: 0.8426 - lr: 1.0000e-04
Epoch 24/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2880 - dice_coef: 0.7122 - recall: 0.9081 - precision: 0.7771 - val_loss: 0.5325 - val_dice_coef: 0.4675 - val_recall: 0.4121 - val_precision: 0.8302 - lr: 1.0000e-04
Epoch 25/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2774 - dice_coef: 0.7227 - recall: 0.9024 - precision: 0.7866 - val_loss: 0.4373 - val_dice_coef: 0.5627 - val_recall: 0.5904 - val_precision: 0.7232 - lr: 1.0000e-04
Epoch 26/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2764 - dice_coef: 0.7239 - recall: 0.8976 - precision: 0.7944 - val_loss: 0.3774 - val_dice_coef: 0.6226 - val_recall: 0.7440 - val_precision: 0.6529 - lr: 1.0000e-04
Epoch 27/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2613 - dice_coef: 0.7388 - recall: 0.9118 - precision: 0.7979 - val_loss: 0.4698 - val_dice_coef: 0.5302 - val_recall: 0.5066 - val_precision: 0.7732 - lr: 1.0000e-04
Epoch 28/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2565 - dice_coef: 0.7438 - recall: 0.9062 - precision: 0.8093 - val_loss: 0.4123 - val_dice_coef: 0.5877 - val_recall: 0.6948 - val_precision: 0.6284 - lr: 1.0000e-04
Epoch 29/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2376 - dice_coef: 0.7624 - recall: 0.9146 - precision: 0.8183 - val_loss: 0.4958 - val_dice_coef: 0.5042 - val_recall: 0.4703 - val_precision: 0.7650 - lr: 1.0000e-04
Epoch 30/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2322 - dice_coef: 0.7677 - recall: 0.9130 - precision: 0.8277 - val_loss: 0.4888 - val_dice_coef: 0.5112 - val_recall: 0.4827 - val_precision: 0.7509 - lr: 1.0000e-04
Epoch 31/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2207 - dice_coef: 0.7794 - recall: 0.9190 - precision: 0.8436 - val_loss: 0.4484 - val_dice_coef: 0.5516 - val_recall: 0.5430 - val_precision: 0.7413 - lr: 1.0000e-05
Epoch 32/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2204 - dice_coef: 0.7798 - recall: 0.9177 - precision: 0.8471 - val_loss: 0.4219 - val_dice_coef: 0.5781 - val_recall: 0.6082 - val_precision: 0.6887 - lr: 1.0000e-05
Epoch 33/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2280 - dice_coef: 0.7722 - recall: 0.9174 - precision: 0.8387 - val_loss: 0.4377 - val_dice_coef: 0.5623 - val_recall: 0.5600 - val_precision: 0.7334 - lr: 1.0000e-05
Epoch 34/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2191 - dice_coef: 0.7809 - recall: 0.9171 - precision: 0.8469 - val_loss: 0.4200 - val_dice_coef: 0.5800 - val_recall: 0.5930 - val_precision: 0.7195 - lr: 1.0000e-05
Epoch 35/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2212 - dice_coef: 0.7789 - recall: 0.9195 - precision: 0.8428 - val_loss: 0.4297 - val_dice_coef: 0.5703 - val_recall: 0.5698 - val_precision: 0.7374 - lr: 1.0000e-06
Epoch 36/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2253 - dice_coef: 0.7746 - recall: 0.9128 - precision: 0.8445 - val_loss: 0.4334 - val_dice_coef: 0.5666 - val_recall: 0.5625 - val_precision: 0.7426 - lr: 1.0000e-06
Q. 2.D. Evaluate and share insights on performance of the model.¶
In [ ]:
test_steps = (len(X_test)//batch_size)
if len(X_test) % batch_size != 0:
    test_steps += 1

model.evaluate(X_test, Y_test, steps=test_steps)
2/2 [==============================] - 0s 12ms/step - loss: 0.4334 - dice_coef: 0.5666 - recall: 0.5625 - precision: 0.7426
Out[ ]:
[0.4333670139312744,
 0.5666329860687256,
 0.5624674558639526,
 0.7426241636276245]
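Note that in every epoch of the log, loss + dice_coef ≈ 1, which suggests the loss was defined as 1 − Dice. A minimal NumPy sketch of the smoothed Dice coefficient on binary masks (the `smooth=1.0` constant is a common convention and an assumption here, not confirmed from the training code):

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    """Smoothed Dice coefficient between two binary masks (values in {0, 1})."""
    y_true = y_true.astype(np.float64).ravel()
    y_pred = y_pred.astype(np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

mask = np.array([[1, 1], [0, 0]])
print(dice_coef(mask, mask))      # identical masks -> 1.0
print(dice_coef(mask, 1 - mask))  # disjoint masks -> 0.2 (nonzero only due to smoothing)
```

The smoothing constant keeps the ratio defined when both masks are empty, which is why perfectly disjoint masks score slightly above zero.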
In [ ]:
#Show the training vs validation loss
plt.figure(figsize=(10, 5))
plt.plot(model_run.history['loss'], label='Training Loss')
plt.plot(model_run.history['val_loss'], label='Validation Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.show()

#Show the training vs validation dice coef over the epochs
plt.plot(model_run.history['dice_coef'], label='Training Dice Coef')
plt.plot(model_run.history['val_dice_coef'], label='Validation Dice Coef')
plt.plot(model_run.history['recall'], label='Training Recall')
plt.plot(model_run.history['val_recall'], label='Validation Recall')
plt.plot(model_run.history['precision'], label='Training Precision')
plt.plot(model_run.history['val_precision'], label='Validation Precision')
plt.title('Model Metrics')
plt.ylabel('Metrics')
plt.xlabel('Epoch')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
plt.show()
[Figure: training vs validation loss per epoch]
[Figure: training vs validation dice coefficient, recall and precision per epoch]
Insights on the performance of the model¶
  • The model was trained on 400 images and tested on 9 images.
  • Training was configured for 500 epochs with a batch size of 8, but it stopped early because the loss had flattened out; the learning rate was also reduced in steps from 1e-04 down to 1e-06.
  • Training loss declined steadily, while validation loss fluctuated before stabilizing and flattening out.
  • Training recall reached a steady value quickly, but validation recall declined over time.
  • Training and validation precision were broadly similar and stabilized within about 10 epochs, although validation precision fluctuated considerably towards the end.
  • The dice coefficient for both training and validation improved together until around epoch 10 and then flattened out.
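The stepwise learning-rate drops visible in the log (1e-04 → 1e-05 → 1e-06) and the stop well before epoch 500 are typical of Keras callbacks. A hedged configuration sketch assuming `ReduceLROnPlateau` and `EarlyStopping` (the patience and factor values here are illustrative, not necessarily the ones used for this run):

```python
import tensorflow as tf

# Illustrative values only: drop lr by 10x after 5 stagnant epochs,
# stop after 10 stagnant epochs and restore the best weights seen so far.
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1,
                                         patience=5, min_lr=1e-6),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                     restore_best_weights=True),
]

# model.fit(X_train, Y_train, validation_data=(X_test, Y_test),
#           epochs=500, batch_size=8, callbacks=callbacks)
```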

3. Test the model predictions on the test image: ‘image with index 3 in the test data’ and visualise the predicted masks on the faces in the image. [2 Marks]¶

In [ ]:
def show_original_vs_predicted_face_area(i):
    fig, axs = plt.subplots(1, 2, figsize=(10, 5))

    test_image = np.copy(X_test[i])
    #Draw a contour around the labelled face area (from Y_test) in test_image
    contours, _ = cv2.findContours(Y_test[i].astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(test_image, contours, -1, (0, 255, 0), 1)

    axs[0].imshow((test_image/255).astype(np.float32))
    axs[0].set_title("Original Image with Labelled Face Contour")
    axs[0].axis('off')

    test_image = np.copy(X_test[i])

    Y_pred = model.predict(np.array([test_image]))
    pred_mask = cv2.resize((1.0*(Y_pred[0] > 0.5)), (image_width,image_height))

    #Draw a contour around the detected face in test_image from the predicted mask area
    contours, _ = cv2.findContours(pred_mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(test_image, contours, -1, (0, 255, 0), 1)

    axs[1].imshow((test_image/255).astype(np.float32))
    axs[1].set_title("Predicted Face Area in Original Image")
    axs[1].axis('off')

    plt.show()


#Show third image
show_original_vs_predicted_face_area(3)
1/1 [==============================] - 1s 1s/step
[Figure: original image with labelled face contour vs predicted face area for test image index 3]
In [ ]:
del data
del X
del Y
del model

PART B - 10 Marks¶

  • DOMAIN: Entertainment

  • CONTEXT: Company X owns a movie application and repository that provides movie streaming to millions of users on a subscription basis. The company wants to automate the process of displaying cast and crew information for each scene in a movie, so that when a user pauses the movie and clicks the cast information button, the app shows details of the actors in the scene. The company has in-house computer vision and multimedia experts who need to detect faces in screenshots from movie scenes. The data labelling is already done.

  • DATA DESCRIPTION: The dataset comprises images and their corresponding human-face masks.

  • PROJECT OBJECTIVE: To create an image dataset to be used by the AI team to build an image classifier. Profile images of people are given.

Steps and tasks: [ Total Score: 10 Marks]¶

1. Read/import images from folder ‘training_images’. [2 Marks]¶

In [ ]:
#Import images from folder 'training_images'

#Unzip the file training_images.zip into session
if RunningInCOLAB:
    training_images_zip = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/training_images.zip' # Google Drive path
else:
    training_images_zip = 'training_images.zip'

#We will extract to local folder, loading image from google drive is time consuming
training_images_extract_folder = 'training_images'

with zipfile.ZipFile(training_images_zip, 'r') as zip_ref:
    zip_ref.extractall(training_images_extract_folder)

training_images  = os.listdir(training_images_extract_folder+'/training_images')
print(training_images)
['real_00176.jpg', 'real_00379.jpg', 'real_00686.jpg', 'real_00566.jpg', 'real_00319.jpg', 'real_00820.jpg', 'real_00041.jpg', 'real_00901.jpg', 'real_00109.jpg', 'real_00624.jpg', 'real_00398.jpg', 'real_00720.jpg', 'real_00145.jpg', 'real_00841.jpg', 'real_00010.jpg', 'real_00313.jpg', 'real_00090.jpg', 'real_00840.jpg', 'real_00215.jpg', 'real_00128.jpg', 'real_01050.jpg', 'real_00753.jpg', 'real_00298.jpg', 'real_00076.jpg', 'real_00574.jpg', 'real_00413.jpg', 'real_00049.jpg', 'real_00315.jpg', 'real_00250.jpg', 'real_00629.jpg', 'real_01064.jpg', 'real_00667.jpg', 'real_00927.jpg', 'real_00567.jpg', 'real_00543.jpg', 'real_00410.jpg', 'real_00485.jpg', 'real_00865.jpg', 'real_00706.jpg', 'real_00406.jpg', 'real_00874.jpg', 'real_00972.jpg', 'real_00259.jpg', 'real_00437.jpg', 'real_01019.jpg', 'real_00210.jpg', 'real_00992.jpg', 'real_00609.jpg', 'real_00106.jpg', 'real_00494.jpg', 'real_00687.jpg', 'real_00237.jpg', 'real_00356.jpg', 'real_00487.jpg', 'real_00480.jpg', 'real_00321.jpg', 'real_00732.jpg', 'real_00929.jpg', 'real_00769.jpg', 'real_01009.jpg', 'real_00716.jpg', 'real_00081.jpg', 'real_00715.jpg', 'real_00636.jpg', 'real_01069.jpg', 'real_00851.jpg', 'real_00149.jpg', 'real_00352.jpg', 'real_00014.jpg', 'real_00801.jpg', 'real_01007.jpg', 'real_00895.jpg', 'real_00299.jpg', 'real_00218.jpg', 'real_00110.jpg', 'real_00294.jpg', 'real_00898.jpg', 'real_00074.jpg', 'real_00345.jpg', 'real_00292.jpg', 'real_00071.jpg', 'real_00497.jpg', 'real_00476.jpg', 'real_00272.jpg', 'real_01053.jpg', 'real_00302.jpg', 'real_00529.jpg', 'real_00942.jpg', 'real_00516.jpg', 'real_00863.jpg', 'real_00909.jpg', 'real_00538.jpg', 'real_00160.jpg', 'real_00515.jpg', 'real_00220.jpg', 'real_00499.jpg', 'real_00893.jpg', 'real_00565.jpg', 'real_00894.jpg', 'real_00020.jpg', 'real_00492.jpg', 'real_01047.jpg', 'real_00853.jpg', 'real_00703.jpg', 'real_00118.jpg', 'real_00450.jpg', 'real_00951.jpg', 'real_00105.jpg', 'real_00167.jpg', 'real_00258.jpg', 'real_00306.jpg', 
'real_00045.jpg', 'real_00734.jpg', 'real_00617.jpg', 'real_00270.jpg', 'real_01032.jpg', 'real_00318(1).jpg', 'real_01002.jpg', 'real_00905.jpg', 'real_00827.jpg', 'real_00002.jpg', 'real_01026.jpg', 'real_00839.jpg', 'real_00330.jpg', 'real_00233.jpg', 'real_00811.jpg', 'real_00229.jpg', 'real_00203.jpg', 'real_00752.jpg', 'real_00659.jpg', 'real_00603.jpg', 'real_00681.jpg', 'real_00301.jpg', 'real_00148.jpg', 'real_00336.jpg', 'real_00494(1).jpg', 'real_00795.jpg', 'real_00281.jpg', 'real_00606.jpg', 'real_00456.jpg', 'real_00243.jpg', 'real_00810.jpg', 'real_00957.jpg', 'real_00478.jpg', 'real_00368.jpg', 'real_01059.jpg', 'real_00709.jpg', 'real_00007.jpg', 'real_00787.jpg', 'real_00824.jpg', 'real_00366.jpg', 'real_00171.jpg', 'real_00372.jpg', 'real_00219.jpg', 'real_00028.jpg', 'real_00594.jpg', 'real_00676.jpg', 'real_00344.jpg', 'real_00449.jpg', 'real_00426.jpg', 'real_00387.jpg', 'real_00702.jpg', 'real_00742.jpg', 'real_00268.jpg', 'real_00405.jpg', 'real_00758.jpg', 'real_00190.jpg', 'real_00073.jpg', 'real_00685.jpg', 'real_00445.jpg', 'real_00597.jpg', 'real_00639.jpg', 'real_00044.jpg', 'real_00648.jpg', 'real_00349.jpg', 'real_00936.jpg', 'real_00322.jpg', 'real_00757.jpg', 'real_00390.jpg', 'real_00392.jpg', 'real_00825.jpg', 'real_00961.jpg', 'real_00378.jpg', 'real_00989.jpg', 'real_00260.jpg', 'real_00943.jpg', 'real_00868.jpg', 'real_01036.jpg', 'real_00990.jpg', 'real_00251.jpg', 'real_00319(1).jpg', 'real_01076.jpg', 'real_00083.jpg', 'real_00970.jpg', 'real_00483.jpg', 'real_00604.jpg', 'real_00969.jpg', 'real_00711.jpg', 'real_01029.jpg', 'real_00558.jpg', 'real_00138.jpg', 'real_00756.jpg', 'real_00236.jpg', 'real_00424.jpg', 'real_00584.jpg', 'real_00470.jpg', 'real_00586.jpg', 'real_00133.jpg', 'real_00920.jpg', 'real_00781.jpg', 'real_01041.jpg', 'real_01042.jpg', 'real_00843.jpg', 'real_01011.jpg', 'real_00303.jpg', 'real_00239.jpg', 'real_00193.jpg', 'real_00448.jpg', 'real_01037.jpg', 'real_01056.jpg', 'real_00269.jpg', 
'real_00572.jpg', 'real_00844.jpg', 'real_00647.jpg', 'real_00198.jpg', 'real_00510.jpg', 'real_00669.jpg', 'real_00504.jpg', 'real_00830.jpg', 'real_01065.jpg', 'real_00391.jpg', 'real_00054.jpg', 'real_00802.jpg', 'real_00441.jpg', 'real_00748.jpg', 'real_00358.jpg', 'real_00157.jpg', 'real_01075.jpg', 'real_00671.jpg', 'real_00842.jpg', 'real_00418.jpg', 'real_00025.jpg', 'real_00384.jpg', 'real_00812.jpg', 'real_00767.jpg', 'real_00438.jpg', 'real_00500.jpg', 'real_00876.jpg', 'real_00038.jpg', 'real_00797.jpg', 'real_00228.jpg', 'real_00389.jpg', 'real_00245.jpg', 'real_00540.jpg', 'real_00654.jpg', 'real_00238.jpg', 'real_00937.jpg', 'real_00678.jpg', 'real_00974.jpg', 'real_00147.jpg', 'real_00482.jpg', 'real_00253.jpg', 'real_00733.jpg', 'real_00785.jpg', 'real_00608.jpg', 'real_00353.jpg', 'real_00455.jpg', 'real_00741.jpg', 'real_00635.jpg', 'real_00182.jpg', 'real_00607.jpg', 'real_00354.jpg', 'real_00430.jpg', 'real_00231.jpg', 'real_00993.jpg', 'real_00271.jpg', 'real_00694.jpg', 'real_01071.jpg', 'real_00367.jpg', 'real_00030.jpg', 'real_00693.jpg', 'real_00295.jpg', 'real_00411.jpg', 'real_01039.jpg', 'real_00123.jpg', 'real_00690.jpg', 'real_00408.jpg', 'real_00786.jpg', 'real_00775.jpg', 'real_00650.jpg', 'real_00532.jpg', 'real_00360.jpg', 'real_00247.jpg', 'real_00195.jpg', 'real_01024.jpg', 'real_00443.jpg', 'real_00205.jpg', 'real_00064.jpg', 'real_00495(1).jpg', 'real_00015.jpg', 'real_01012.jpg', 'real_00394.jpg', 'real_00980.jpg', 'real_00794.jpg', 'real_00699.jpg', 'real_00869.jpg', 'real_00834.jpg', 'real_00001.jpg', 'real_00503.jpg', 'real_00048.jpg', 'real_00040.jpg', 'real_00057.jpg', 'real_00618.jpg', 'real_00968.jpg', 'real_00446.jpg', 'real_00587.jpg', 'real_00434.jpg', 'real_00616.jpg', 'real_00158.jpg', 'real_00773.jpg', 'real_00953.jpg', 'real_00063.jpg', 'real_00161.jpg', 'real_00453.jpg', 'real_00933.jpg', 'real_00377.jpg', 'real_00745.jpg', 'real_00871.jpg', 'real_00273.jpg', 'real_00900.jpg', 'real_00744.jpg', 
'real_00380.jpg', 'real_00029.jpg', 'real_00530.jpg', 'real_00556.jpg', 'real_00196.jpg', 'real_00480(1).jpg', 'real_00283.jpg', 'real_00870.jpg', 'real_01007(1).jpg', 'real_00474.jpg', 'real_01078.jpg', 'real_00945.jpg', 'real_00808.jpg', 'real_00055.jpg', 'real_00230.jpg', 'real_01043.jpg', 'real_00006.jpg', 'real_00102.jpg', 'real_00101.jpg', 'real_00541.jpg', 'real_00060.jpg', 'real_00293.jpg', 'real_00926.jpg', 'real_00267.jpg', 'real_00388.jpg', 'real_00018.jpg', 'real_01046.jpg', 'real_00466.jpg', 'real_00496.jpg', 'real_00144.jpg', 'real_00508.jpg', 'real_01033.jpg', 'real_00407.jpg', 'real_00442.jpg', 'real_00804.jpg', 'real_00168.jpg', 'real_01027.jpg', 'real_00796.jpg', 'real_00978.jpg', 'real_00340.jpg', 'real_00337.jpg', 'real_00559.jpg', 'real_00774.jpg', 'real_01006.jpg', 'real_00427.jpg', 'real_00822.jpg', 'real_00008.jpg', 'real_00727.jpg', 'real_00150.jpg', 'real_00206.jpg', 'real_00925.jpg', 'real_01013(1).jpg', 'real_00944.jpg', 'real_00404.jpg', 'real_00486.jpg', 'real_00619.jpg', 'real_00816.jpg', 'real_00975.jpg', 'real_00922.jpg', 'real_00717.jpg', 'real_00498.jpg', 'real_00433.jpg', 'real_01021.jpg', 'real_00664.jpg', 'real_00985.jpg', 'real_00576.jpg', 'real_00112.jpg', 'real_00581.jpg', 'real_00525.jpg', 'real_00459.jpg', 'real_00502.jpg', 'real_00582.jpg', 'real_00092.jpg', 'real_00127.jpg', 'real_00093.jpg', 'real_00019.jpg', 'real_00882.jpg', 'real_00921.jpg', 'real_00493.jpg', 'real_00254.jpg', 'real_01014.jpg', 'real_00518.jpg', 'real_00114.jpg', 'real_00755.jpg', 'real_00738.jpg', 'real_00374.jpg', 'real_00338.jpg', 'real_00409.jpg', 'real_00139.jpg', 'real_00964.jpg', 'real_00962.jpg', 'real_00862.jpg', 'real_00341.jpg', 'real_00070.jpg', 'real_00416.jpg', 'real_00013.jpg', 'real_00039.jpg', 'real_00915.jpg', 'real_00194.jpg', 'real_00152.jpg', 'real_00657.jpg', 'real_00402.jpg', 'real_00332.jpg', 'real_00108.jpg', 'real_00670.jpg', 'real_00429.jpg', 'real_00221.jpg', 'real_00335.jpg', 'real_01060.jpg', 'real_00117.jpg', 
'real_00622.jpg', 'real_00881.jpg', 'real_00548.jpg', 'real_00523.jpg', 'real_00350.jpg', 'real_00464.jpg', 'real_00363.jpg', 'real_00712.jpg', 'real_00549.jpg', 'real_00386.jpg', 'real_00890.jpg', 'real_01023.jpg', 'real_00719.jpg', 'real_00946.jpg', 'real_00199.jpg', 'real_00731.jpg', 'real_00873.jpg', 'real_00479.jpg', 'real_00300.jpg', 'real_00201.jpg', 'real_00701.jpg', 'real_00146.jpg', 'real_00762.jpg', 'real_00817.jpg', 'real_01031.jpg', 'real_00573.jpg', 'real_00782.jpg', 'real_00979.jpg', 'real_00163.jpg', 'real_00856.jpg', 'real_00723.jpg', 'real_00373.jpg', 'real_00638.jpg', 'real_01040.jpg', 'real_00561.jpg', 'real_00778.jpg', 'real_00803.jpg', 'real_00209.jpg', 'real_00080.jpg', 'real_00628.jpg', 'real_01052.jpg', 'real_00451.jpg', 'real_00591.jpg', 'real_00444.jpg', 'real_00677.jpg', 'real_00533.jpg', 'real_00222.jpg', 'real_00475.jpg', 'real_00312.jpg', 'real_00784.jpg', 'real_00959.jpg', 'real_00704.jpg', 'real_00736.jpg', 'real_00697.jpg', 'real_00130.jpg', 'real_00460.jpg', 'real_01000.jpg', 'real_00960.jpg', 'real_00779.jpg', 'real_00134.jpg', 'real_00417.jpg', 'real_01074.jpg', 'real_00952.jpg', 'real_01066.jpg', 'real_00627.jpg', 'real_00506.jpg', 'real_00714.jpg', 'real_00861.jpg', 'real_00973.jpg', 'real_00911.jpg', 'real_00632.jpg', 'real_00180.jpg', 'real_00513.jpg', 'real_00412.jpg', 'real_00517.jpg', 'real_00605.jpg', 'real_00454.jpg', 'real_00896.jpg', 'real_00113.jpg', 'real_00275.jpg', 'real_00836.jpg', 'real_00003.jpg', 'real_00954.jpg', 'real_00805.jpg', 'real_00598.jpg', 'real_00527.jpg', 'real_00234.jpg', 'real_00547.jpg', 'real_00156.jpg', 'real_00791.jpg', 'real_00115.jpg', 'real_00554.jpg', 'real_00771.jpg', 'real_00175.jpg', 'real_00375.jpg', 'real_00866.jpg', 'real_00847.jpg', 'real_00643.jpg', 'real_00181.jpg', 'real_00491.jpg', 'real_00369.jpg', 'real_00277.jpg', 'real_00545.jpg', 'real_00991.jpg', 'real_00708.jpg', 'real_00596.jpg', 'real_00912.jpg', 'real_00280.jpg', 'real_00759.jpg', 'real_01035.jpg', 'real_00718.jpg', 
'real_00122.jpg', 'real_00382.jpg', 'real_00179.jpg', 'real_00845.jpg', 'real_00461.jpg', 'real_00351.jpg', 'real_00585.jpg', 'real_00296.jpg', 'real_00432.jpg', 'real_00789.jpg', 'real_00421.jpg', 'real_00457.jpg', 'real_00241.jpg', 'real_00878.jpg', 'real_00425.jpg', 'real_00763.jpg', 'real_00263.jpg', 'real_00949.jpg', 'real_00754.jpg', 'real_00032.jpg', 'real_00948.jpg', 'real_00305.jpg', 'real_00422.jpg', 'real_00274.jpg', 'real_00770.jpg', 'real_00806.jpg', 'real_00068.jpg', 'real_00534.jpg', 'real_00289.jpg', 'real_00348.jpg', 'real_00397.jpg', 'real_00546.jpg', 'real_00735.jpg', 'real_00435.jpg', 'real_00511.jpg', 'real_00520.jpg', 'real_00107.jpg', 'real_00065.jpg', 'real_00776.jpg', 'real_01038.jpg', 'real_00257.jpg', 'real_01055.jpg', 'real_00852.jpg', 'real_00884.jpg', 'real_01061.jpg', 'real_00309.jpg', 'real_00248.jpg', 'real_00995.jpg', 'real_00189.jpg', 'real_00564.jpg', 'real_00679.jpg', 'real_00046.jpg', 'real_00021.jpg', 'real_00907.jpg', 'real_00085.jpg', 'real_00077.jpg', 'real_00692.jpg', 'real_00066.jpg', 'real_00339.jpg', 'real_00560.jpg', 'real_00652.jpg', 'real_00088.jpg', 'real_00683.jpg', 'real_01072.jpg', 'real_00987.jpg', 'real_00111.jpg', 'real_00579.jpg', 'real_00097.jpg', 'real_00640.jpg', 'real_00467.jpg', 'real_00854.jpg', 'real_00428.jpg', 'real_00489.jpg', 'real_00726.jpg', 'real_01008.jpg', 'real_00223.jpg', 'real_00314.jpg', 'real_00646.jpg', 'real_00580.jpg', 'real_00695.jpg', 'real_00216.jpg', 'real_00343.jpg', 'real_00923.jpg', 'real_00832.jpg', 'real_00749.jpg', 'real_00027.jpg', 'real_00333.jpg', 'real_00965.jpg', 'real_00244.jpg', 'real_00126.jpg', 'real_01044.jpg', 'real_00994.jpg', 'real_00225.jpg', 'real_00266.jpg', 'real_00342.jpg', 'real_00809.jpg', 'real_00415.jpg', 'real_00252.jpg', 'real_00170.jpg', 'real_00838.jpg', 'real_01003.jpg', 'real_00120.jpg', 'real_00012.jpg', 'real_00361.jpg', 'real_00154.jpg', 'real_00674.jpg', 'real_00286.jpg', 'real_00099.jpg', 'real_00986.jpg', 'real_00983.jpg', 'real_00287.jpg', 
'real_00569.jpg', 'real_00725.jpg', 'real_01005.jpg', 'real_00282.jpg', 'real_00747.jpg', 'real_00151.jpg', 'real_00780.jpg', 'real_00680.jpg', 'real_00005.jpg', 'real_00061.jpg', 'real_00570.jpg', 'real_00792.jpg', 'real_00630.jpg', 'real_00100.jpg', 'real_00610.jpg', 'real_00739.jpg', 'real_00550.jpg', 'real_00599.jpg', 'real_00713.jpg', 'real_00043.jpg', 'real_00939.jpg', 'real_00829.jpg', 'real_00668.jpg', 'real_01017.jpg', 'real_00592.jpg', 'real_00998.jpg', 'real_00177.jpg', 'real_00768.jpg', 'real_00684.jpg', 'real_00172.jpg', 'real_00035.jpg', 'real_00764.jpg', 'real_00393.jpg', 'real_00751.jpg', 'real_00124.jpg', 'real_00164.jpg', 'real_00928.jpg', 'real_00155.jpg', 'real_00700.jpg', 'real_00539.jpg', 'real_00831.jpg', 'real_00169.jpg', 'real_00098.jpg', 'real_00346.jpg', 'real_01062.jpg', 'real_00971.jpg', 'real_00291.jpg', 'real_00662.jpg', 'real_00458.jpg', 'real_00743.jpg', 'real_00656.jpg', 'real_00600.jpg', 'real_01058.jpg', 'real_00793.jpg', 'real_00320.jpg', 'real_00910.jpg', 'real_00173.jpg', 'real_00058.jpg', 'real_00760.jpg', 'real_00036.jpg', 'real_00578.jpg', 'real_00440.jpg', 'real_00477.jpg', 'real_00689.jpg', 'real_00660.jpg', 'real_00956.jpg', 'real_00403.jpg', 'real_00016.jpg', 'real_01070.jpg', 'real_00729.jpg', 'real_00162.jpg', 'real_00846.jpg', 'real_00323.jpg', 'real_00772.jpg', 'real_00096.jpg', 'real_00612.jpg', 'real_00456(1).jpg', 'real_00317.jpg', 'real_00750.jpg', 'real_00137.jpg', 'real_00555.jpg', 'real_00880.jpg', 'real_00947.jpg', 'real_00091.jpg', 'real_00439.jpg', 'real_01016.jpg', 'real_00799.jpg', 'real_00821.jpg', 'real_00818.jpg', 'real_00828.jpg', 'real_00675.jpg', 'real_00310.jpg', 'real_00938.jpg', 'real_00211.jpg', 'real_00140.jpg', 'real_00521.jpg', 'real_00249.jpg', 'real_00528.jpg', 'real_00331(1).jpg', 'real_01025.jpg', 'real_00325.jpg', 'real_00174.jpg', 'real_00278.jpg', 'real_00187.jpg', 'real_00917.jpg', 'real_01020.jpg', 'real_00783.jpg', 'real_00590.jpg', 'real_00601.jpg', 'real_00472.jpg', 
'real_00642.jpg', 'real_01001.jpg', 'real_00673.jpg', 'real_00307.jpg', 'real_00672.jpg', 'real_01079.jpg', 'real_01068.jpg', 'real_00129.jpg', 'real_00977.jpg', 'real_00931.jpg', 'real_00031.jpg', 'real_00423.jpg', 'real_00383.jpg', 'real_00011.jpg', 'real_00691.jpg', 'real_00276.jpg', 'real_00551.jpg', 'real_00537.jpg', 'real_00473.jpg', 'real_00552.jpg', 'real_00698.jpg', 'real_00256.jpg', 'real_01049.jpg', 'real_00626.jpg', 'real_00017.jpg', 'real_00024.jpg', 'real_00431.jpg', 'real_00359.jpg', 'real_00615.jpg', 'real_00197.jpg', 'real_00159.jpg', 'real_00859.jpg', 'real_01010.jpg', 'real_00835.jpg', 'real_00885.jpg', 'real_00213.jpg', 'real_01028.jpg', 'real_00185.jpg', 'real_00877.jpg', 'real_00062.jpg', 'real_00631.jpg', 'real_00649.jpg', 'real_00304.jpg', 'real_01015.jpg', 'real_00075.jpg', 'real_00371.jpg', 'real_00930.jpg', 'real_00324.jpg', 'real_00526.jpg', 'real_00311.jpg', 'real_00688.jpg', 'real_00284.jpg', 'real_00103.jpg', 'real_01054.jpg', 'real_00200.jpg', 'real_00033.jpg', 'real_00777.jpg', 'real_00484.jpg', 'real_00904.jpg', 'real_00022.jpg', 'real_00452.jpg', 'real_00072.jpg', 'real_00710.jpg', 'real_00902.jpg', 'real_00095.jpg', 'real_00401.jpg', 'real_00906.jpg', 'real_00208.jpg', 'real_00082.jpg', 'real_00867.jpg', 'real_00903.jpg', 'real_00465.jpg', 'real_00666.jpg', 'real_00009.jpg', 'real_00644.jpg', 'real_00524.jpg', 'real_00707.jpg', 'real_00997.jpg', 'real_00595.jpg', 'real_01073.jpg', 'real_01034.jpg', 'real_00542.jpg', 'real_00967.jpg', 'real_00651.jpg', 'real_00316.jpg', 'real_00023.jpg', 'real_00593.jpg', 'real_00892.jpg', 'real_00142.jpg', 'real_00730.jpg', 'real_00183.jpg', 'real_00568.jpg', 'real_00988.jpg', 'real_00571.jpg', 'real_00696.jpg', 'real_00078.jpg', 'real_00891.jpg', 'real_00026.jpg', 'real_00235.jpg', 'real_00056.jpg', 'real_00514.jpg', 'real_00217.jpg', 'real_00331.jpg', 'real_00119.jpg', 'real_00357.jpg', 'real_00261.jpg', 'real_00094.jpg', 'real_00823.jpg', 'real_01004.jpg', 'real_00052.jpg', 'real_00519.jpg', 
'real_00858.jpg', 'real_00860.jpg', 'real_01013.jpg', 'real_00724.jpg', 'real_00053.jpg', 'real_00899.jpg', 'real_00034.jpg', 'real_01048.jpg', 'real_00327.jpg', 'real_00242.jpg', 'real_01030.jpg', 'real_00471.jpg', 'real_00067.jpg', 'real_00505.jpg', 'real_00469.jpg', 'real_00125.jpg', 'real_00850.jpg', 'real_00087.jpg', 'real_00141.jpg', 'real_01051.jpg', 'real_00996.jpg', 'real_00318.jpg', 'real_00347.jpg', 'real_00365.jpg', 'real_00634.jpg', 'real_00326.jpg', 'real_00682.jpg', 'real_00788.jpg', 'real_00086.jpg', 'real_00848.jpg', 'real_00191.jpg', 'real_00308.jpg', 'real_00509.jpg', 'real_00224.jpg', 'real_00468.jpg', 'real_00255.jpg', 'real_00655.jpg', 'real_00116.jpg', 'real_00705.jpg', 'real_00214.jpg', 'real_01080.jpg', 'real_00982.jpg', 'real_01067.jpg', 'real_00488.jpg', 'real_01081.jpg', 'real_00857.jpg', 'real_00059.jpg', 'real_00202.jpg', 'real_00420.jpg', 'real_00192.jpg', 'real_00577.jpg', 'real_00104.jpg', 'real_00855.jpg', 'real_00620.jpg', 'real_00153.jpg', 'real_00522.jpg', 'real_00935.jpg', 'real_00400.jpg', 'real_00399.jpg', 'real_00328.jpg', 'real_00722.jpg', 'real_00084.jpg', 'real_00800.jpg', 'real_00984.jpg', 'real_00934.jpg', 'real_00447.jpg', 'real_00562.jpg', 'real_00132.jpg', 'real_00051.jpg', 'real_00761.jpg', 'real_00184.jpg', 'real_00512.jpg', 'real_00621.jpg', 'real_00004.jpg', 'real_00849.jpg', 'real_00976.jpg', 'real_00583.jpg', 'real_00385.jpg', 'real_00740.jpg', 'real_00536.jpg', 'real_00507.jpg', 'real_01045.jpg', 'real_00837.jpg', 'real_00864.jpg', 'real_00395.jpg', 'real_00981.jpg', 'real_00037.jpg', 'real_00285.jpg', 'real_01063.jpg', 'real_00188.jpg', 'real_00265.jpg', 'real_00883.jpg', 'real_00563.jpg', 'real_00240.jpg', 'real_00362.jpg', 'real_00165.jpg', 'real_00297.jpg', 'real_00047.jpg', 'real_00815.jpg', 'real_00887.jpg', 'real_00889.jpg', 'real_00913.jpg', 'real_00355.jpg', 'real_00641.jpg', 'real_00908.jpg', 'real_00872.jpg', 'real_00637.jpg', 'real_00481.jpg', 'real_00826.jpg', 'real_00919.jpg', 'real_00813.jpg', 
'real_00166.jpg', 'real_00999.jpg', 'real_00950.jpg', 'real_00897.jpg', 'real_00766.jpg', 'real_00625.jpg', 'real_00414.jpg', 'real_00623.jpg', 'real_00914.jpg', 'real_00436.jpg', 'real_00042.jpg', 'real_01018.jpg', 'real_00178.jpg', 'real_00916.jpg', 'real_00653.jpg', 'real_00918.jpg', 'real_00204.jpg', 'real_00495.jpg', 'real_00232.jpg', 'real_00419.jpg', 'real_00875.jpg', 'real_00364.jpg', 'real_00963.jpg', 'real_00575.jpg', 'real_00966.jpg', 'real_00544.jpg', 'real_00658.jpg', 'real_00814.jpg', 'real_00611.jpg', 'real_00633.jpg', 'real_00079.jpg', 'real_00226.jpg', 'real_00131.jpg', 'real_01022.jpg', 'real_00746.jpg', 'real_00381.jpg', 'real_00941.jpg', 'real_00728.jpg', 'real_00819.jpg', 'real_00886.jpg', 'real_00765.jpg', 'real_00614.jpg', 'real_00663.jpg', 'real_00790.jpg', 'real_00645.jpg', 'real_00462.jpg', 'real_00264.jpg', 'real_00490.jpg', 'real_00535.jpg', 'real_01057.jpg', 'real_00207.jpg', 'real_00396.jpg', 'real_00589.jpg', 'real_00924.jpg', 'real_00279.jpg', 'real_00136.jpg', 'real_00721.jpg', 'real_00463.jpg', 'real_00329.jpg', 'real_00227.jpg', 'real_00531.jpg', 'real_00376.jpg', 'real_00050.jpg', 'real_00602.jpg', 'real_00288.jpg', 'real_00833.jpg', 'real_00932.jpg', 'real_00737.jpg', 'real_00955.jpg', 'real_00879.jpg', 'real_00121.jpg', 'real_00135.jpg', 'real_00186.jpg', 'real_00958.jpg', 'real_00069.jpg', 'real_00553.jpg', 'real_00089.jpg', 'real_00665.jpg', 'real_00661.jpg', 'real_00557.jpg', 'real_00143.jpg', 'real_00246.jpg', 'real_00807.jpg', 'real_00481(1).jpg', 'real_00501.jpg', 'real_00798.jpg', 'real_00290.jpg', 'real_00370.jpg', 'real_01077.jpg', 'real_00262.jpg', 'real_00334.jpg', 'real_00940.jpg', 'real_00613.jpg', 'real_00588.jpg', 'real_00212.jpg', 'real_00888.jpg']

2. Write a loop which will iterate through all the images in the ‘training_images’ folder and detect the faces present on all the images. [3 Marks ]¶

Hint: You can use the open-source ‘haarcascade_frontalface_default.xml’ (bundled with OpenCV) to detect faces.

In [ ]:
#Iterate through all images in the folder and predict the face area
#Use the open-source 'haarcascade_frontalface_default.xml' bundled with OpenCV to detect faces
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

pbar=tqdm(training_images)
for img in pbar:
    pbar.set_description(img)
    original_image = cv2.imread(f'{training_images_extract_folder}/training_images/{img}')

    #use face_cascade to detect face area

    #Covert to grayscale and equalize the histogram of the grayscale image.
    gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)

    faces = face_cascade.detectMultiScale(gray)
    pbar.set_postfix({'num_faces': len(faces)})
real_00888.jpg: 100%|██████████| 1091/1091 [01:08<00:00, 15.85it/s, num_faces=0]

3. From the same loop above, extract metadata of the faces and write into a DataFrame. [3 Marks]¶

Sample output:

x y w h Total_Faces Image_Name
0 94 144 390 390 1 real_00251.jpg
1 65 87 459 459 1 real_00537.jpg
In [ ]:
#Extract the face area from the original image as above and save it in a Dataframe with columns 'x', 'y', 'w', 'h', 'Total_Faces' and 'Image_Name'
face_data = []
for img in tqdm(training_images):
    original_image = cv2.imread(f'{training_images_extract_folder}/training_images/{img}')
    gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)
    faces = face_cascade.detectMultiScale(gray)
    total_faces = len(faces)
    if(total_faces==0):
        face_data.append([0,0,0,0,0, img]) #No faces found
    for (x, y, w, h) in faces:
        face_data.append([x,y,w,h,total_faces, img])
face_data = pd.DataFrame(face_data, columns=['x','y','w','h','Total_Faces', 'Image_Name'])
print()
print(face_data)
100%|██████████| 1091/1091 [01:06<00:00, 16.41it/s]
        x    y    w    h  Total_Faces      Image_Name
0      66  161  403  403            1  real_00176.jpg
1      45  148  428  428            1  real_00379.jpg
2       9   35  537  537            1  real_00686.jpg
3     167  128  385  385            1  real_00566.jpg
4       0    0    0    0            0  real_00319.jpg
...   ...  ...  ...  ...          ...             ...
1228    0    0    0    0            0  real_00613.jpg
1229  147  156  394  394            1  real_00588.jpg
1230   26  212   65   65            2  real_00212.jpg
1231   15   60  484  484            2  real_00212.jpg
1232    0    0    0    0            0  real_00888.jpg

[1233 rows x 6 columns]
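Because images with multiple detections contribute one row per face (hence 1233 rows for 1091 images), a per-image summary is a quick sanity check on the detector. A hedged sketch on a small hand-made frame with the same columns (the values are made up):

```python
import pandas as pd

# Toy data mirroring the face_data columns above (values are illustrative)
toy = pd.DataFrame(
    [[66, 161, 403, 403, 1, 'a.jpg'],
     [26, 212, 65, 65, 2, 'b.jpg'],
     [15, 60, 484, 484, 2, 'b.jpg'],
     [0, 0, 0, 0, 0, 'c.jpg']],
    columns=['x', 'y', 'w', 'h', 'Total_Faces', 'Image_Name'])

# Faces per image; the 0-face placeholder rows correctly count as zero
per_image = toy.groupby('Image_Name')['Total_Faces'].max()
print(per_image.value_counts().sort_index())  # how many images had 0, 1, 2, ... faces
```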

4. Save the output Dataframe in .csv format. [2 Marks]¶

In [ ]:
#Save the dataframe to a csv file
if RunningInCOLAB:
    face_data_csv = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/face_data.csv' # Google Drive path
else:
    face_data_csv = 'face_data.csv'

face_data.to_csv(face_data_csv, index=False)

del face_data

PART C - 30 Marks¶

  • DOMAIN: Face Recognition

  • CONTEXT: Company X intends to build a face identification model to recognise human faces.

  • DATA DESCRIPTION: The dataset comprises images and their corresponding human-face masks.

  • PROJECT OBJECTIVE: To build a face identification model using the Face Aligned Face Dataset from Pinterest. The dataset contains 10,770 images of 100 people; all images were taken from 'Pinterest' and aligned using the dlib library. Some data samples:

[Figure: sample aligned face images from the dataset]

Steps and tasks: [ Total Score: 30 Marks]¶

1. Unzip, read and Load data(‘PINS.zip’) into session. [2 Marks]¶

In [ ]:
#Unzip the file PINS.zip into session
if RunningInCOLAB:
    pins_zip = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/PINS.zip' # Google Drive path
else:
    pins_zip = 'PINS.zip'

#We will extract to local folder, loading image from google drive is time consuming
pins_extract_folder = 'PINS'

with zipfile.ZipFile(pins_zip, 'r') as zip_ref:
    zip_ref.extractall(pins_extract_folder)

2. Write function to create metadata of the image. [4 Marks]¶

Hint: Metadata means derived information from the available data which can be useful for particular problem statement.

In [ ]:
#Write function to create metadata for the images
#[This is copied from the Hint notebook shared - 'Hint - CV - 2_Part 2_.ipynb']

#Class to create metadata for the image file
# base - base directory of the dataset
# name - identity name
# file - image file name
# person_name : name of the person, after pins_ in the folder name
class IdentityMetadata():
    def __init__(self, base, name, file):
        # print(base, name, file)
        # dataset base directory
        self.base = base
        # identity name
        self.name = name
        # image file name
        self.file = file
        #Person name: derived by removing the 'pins_' prefix from the folder name
        self.person_name = name.split('_')[1]

    def __repr__(self):
        return self.image_path()

    def image_path(self):
        return os.path.join(self.base, self.name, self.file)

3. Write a loop to iterate through each and every image and create metadata for all the images. [4 Marks]¶

In [ ]:
#Function to load metadata
#This loops through the dataset and creates metadata for each image
#path - path of the dataset
def load_metadata(path):
    metadata = []
    for i in os.listdir(path):
        for f in os.listdir(os.path.join(path, i)):
            # Check file extension. Allow only '.jpg'/'.jpeg' files.
            ext = os.path.splitext(f)[1]
            if ext == '.jpg' or ext == '.jpeg':
                metadata.append(IdentityMetadata(path, i, f))
    return np.array(metadata)

#Now load the metadata for the images from the 'PINS/PINS' folder (PINS.zip was unzipped into the 'PINS' folder, which contains another PINS folder with the images)
pins_folder = f'{pins_extract_folder}/PINS'

metadata = load_metadata(pins_folder)

#This function reads the image data from the given image path
def load_image(path):
    img = cv2.imread(path, 1)
    # OpenCV loads images with color channels
    # in BGR order. So we need to reverse them
    return img[...,::-1]
In [ ]:
# Print the first image to check
load_image(metadata[0].image_path())
Out[ ]:
array([[[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       ...,

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        ...,
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]], dtype=uint8)

4. Generate embedding vectors for each face in the dataset. [4 Marks]¶

Hint: Use ‘vgg_face_weights.h5’

In [ ]:
#Define the VGG model
def vgg_face():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.ZeroPadding2D((1,1),input_shape=(224,224, 3)))
    model.add(tf.keras.layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))

    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(128, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(128, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))

    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))

    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))

    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))

    model.add(tf.keras.layers.Convolution2D(4096, (7, 7), activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Convolution2D(4096, (1, 1), activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Convolution2D(2622, (1, 1)))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Activation('softmax'))
    return model
In [ ]:
#Now create the VGG model and load the weights
if RunningInCOLAB:
  weight_file = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/vgg_face_weights.h5' # Google Drive path
else:
  weight_file = 'vgg_face_weights.h5'
model = vgg_face()
model.load_weights(weight_file)
In [ ]:
#Create the VGG Face Descriptor model - this model will be used to extract the features from the images (embeddings)
#Therefore we will remove the last layer from the VGG model and use the output of the second-to-last layer as the embedding output
vgg_face_descriptor = tf.keras.models.Model(inputs=model.layers[0].input, outputs=model.layers[-2].output)
In [ ]:
# Getting embedding vector for first image in the metadata using the pre-trained model

img_path = metadata[0].image_path()
img = load_image(img_path)

# Normalise pixel values: scale RGB values from [0, 255] to the interval [0, 1]
img = (img / 255.).astype(np.float32)

img = cv2.resize(img, dsize = (224,224))
print(img.shape)

# Obtaining embedding vector for an image
# Getting the embedding vector for the above image using vgg_face_descriptor model and printing the shape

embedding_vector = vgg_face_descriptor.predict(np.expand_dims(img, axis=0))[0]
print(embedding_vector.shape)
(224, 224, 3)
1/1 [==============================] - 1s 744ms/step
(2622,)
In [ ]:
#Get the embedding vectors for all the images in the metadata

#initialize the embeddings: embedding size per image will be 2622 (as seen from the output above)
embeddings = np.zeros((metadata.shape[0], 2622))

#We will generate embeddings in batches of images
batch_size=1000 #@param {type:"integer"}

total_batches=len(embeddings)//batch_size + (1 if (len(embeddings)%batch_size > 0) else 0)
for idx in range(0, len(embeddings), batch_size):

  #Batch image indexes - start and end values
  start = idx
  end   = idx+batch_size if idx+batch_size < len(embeddings) else len(embeddings)

  #create the input array of batch size - each input tensor of size [224,224,3]
  inputs = np.zeros([end-start,224,224,3])

  #get the metadata for the batch 'start' to 'end'
  #and populate the input array
  pbar=tqdm(metadata[start:end])
  pbar.set_description(f'Batch {idx//batch_size+1} of {total_batches}')

  for i,m in enumerate(pbar):
    try:
        img = load_image(m.image_path())
        img = cv2.resize(img, dsize = (224,224))
        # scale RGB values to interval [0,1]
        img = (img / 255.).astype(np.float32)
        #Set the input tensor for the image
        inputs[i] = img
    except Exception as e:
        print(str(e))
        print(i,m)

  #Now set the embeddings for the batch
  embeddings[start:end] = vgg_face_descriptor.predict(inputs)

  del inputs #delete to release memory

  print() #print a newline for the pbar
Batch 1 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 231.91it/s]
32/32 [==============================] - 3s 49ms/step

Batch 2 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 223.89it/s]
32/32 [==============================] - 1s 30ms/step

Batch 3 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 227.91it/s]
32/32 [==============================] - 1s 30ms/step

Batch 4 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 224.20it/s]
32/32 [==============================] - 1s 30ms/step

Batch 5 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 226.50it/s]
32/32 [==============================] - 1s 29ms/step

Batch 6 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 226.99it/s]
32/32 [==============================] - 1s 30ms/step

Batch 7 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 227.79it/s]
32/32 [==============================] - 1s 29ms/step

Batch 8 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 225.77it/s]
32/32 [==============================] - 1s 29ms/step

Batch 9 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 227.71it/s]
32/32 [==============================] - 1s 29ms/step

Batch 10 of 11: 100%|██████████| 1000/1000 [00:04<00:00, 229.21it/s]
32/32 [==============================] - 1s 29ms/step

Batch 11 of 11: 100%|██████████| 770/770 [00:03<00:00, 226.93it/s]
25/25 [==============================] - 1s 48ms/step

5. Build distance metrics for identifying the distance between two similar and dissimilar images. [4 Marks]¶

In [ ]:
def distance(emb1, emb2):
    # Squared Euclidean (L2) distance between two embeddings
    return np.sum(np.square(emb1 - emb2))
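The metric used here is the squared Euclidean (L2) distance. Cosine distance is a common alternative for comparing embedding vectors, since it ignores vector magnitude; a minimal sketch (the `cosine_distance` name is illustrative, not part of the original notebook):

```python
import numpy as np

def cosine_distance(emb1, emb2):
    # 1 - cosine similarity: 0 for vectors pointing the same way,
    # up to 2 for vectors pointing in opposite directions
    num = np.dot(emb1, emb2)
    den = np.linalg.norm(emb1) * np.linalg.norm(emb2)
    return 1.0 - num / den
```

Either metric can drive the same similar/dissimilar comparison; only the threshold for "same person" would differ.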
In [ ]:
def show_pair(idx1, idx2):
    plt.figure(figsize=(8,3))
    plt.suptitle(f'Distance = {distance(embeddings[idx1], embeddings[idx2]):.2f}')
    plt.subplot(121)
    plt.imshow(load_image(metadata[idx1].image_path()))
    plt.title(metadata[idx1].person_name)
    plt.subplot(122)
    plt.imshow(load_image(metadata[idx2].image_path()));
    plt.title(metadata[idx2].person_name)
show_pair(102, 103)
show_pair(102, 480)
(output: the two image pairs displayed side by side with their embedding distances)

6. Use PCA for dimensionality reduction. [2 Marks]¶

In [ ]:
#Create train and test data

#Every index divisible by 9 goes to the test set and the rest to the training set, giving roughly a 9:1 train/test split
train_idx = np.arange(metadata.shape[0]) % 9 != 0
test_idx = np.arange(metadata.shape[0]) % 9 == 0

# Create X_train and X_test using the above indices
X_train = embeddings[train_idx]
X_test = embeddings[test_idx]

#For each image 'person_name' from the metadata is the target label
targets = np.array([m.person_name for m in metadata])

#Create y_train and y_test using the same indices as above
y_train = targets[train_idx]
y_test = targets[test_idx]
In [ ]:
# Encode the target identities
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train)
y_test = encoder.transform(y_test)
In [ ]:
#Scale the features using StandardScaler

# Standardize features
sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train)
X_test_sc = sc.transform(X_test)
In [ ]:
# Reduce feature dimensions using Principal Component Analysis

# Covariance matrix
cov_matrix = np.cov(X_train_sc.T)

# Eigen values and vector
eig_vals, eig_vecs = np.linalg.eig(cov_matrix)

# Cumulative variance explained
tot = sum(eig_vals)
var_exp = [(i /tot) * 100 for i in sorted(eig_vals, reverse = True)]
cum_var_exp = np.cumsum(var_exp)

print('Cumulative Variance Explained', cum_var_exp)
Cumulative Variance Explained [ 13.5656337   18.93059331  22.93655973 ...  99.99999983  99.99999999
 100.        ]
In [ ]:
# Get index where cumulative variance explained is > threshold
thres = 95
res = list(filter(lambda i: i > thres, cum_var_exp))[0]
index = (cum_var_exp.tolist().index(res))
print(f'Index of element just greater than {thres}: {str(index)}')
Index of element just greater than 95: 346
In [ ]:
# Plotting
plt.figure(figsize = (15 , 7.2))
plt.bar(range(1, eig_vals.size + 1), var_exp, alpha = 0.5, align = 'center', label = 'Individual explained variance')
plt.step(range(1, eig_vals.size + 1), cum_var_exp, where = 'mid', label = 'Cumulative explained variance')
plt.axhline(y = thres, color = 'r', linestyle = '--')
plt.axvline(x = index, color = 'r', linestyle = '--')
plt.ylabel('Explained Variance Ratio')
plt.xlabel('Principal Components')
plt.legend(loc = 'best')
plt.tight_layout()
plt.show()
(output: explained-variance plot with the 95% threshold and component index marked)
In [ ]:
# Reducing the dimensions
pca = PCA(n_components = index, random_state = 20, svd_solver = 'full', whiten = True)
pca.fit(X_train_sc)
X_train_pca = pca.transform(X_train_sc)
X_test_pca = pca.transform(X_test_sc)
display(X_train_pca.shape, X_test_pca.shape)
(9573, 346)
(1197, 346)
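As an aside, scikit-learn can pick the component count automatically: passing a float in (0, 1) as `n_components` keeps just enough components to explain that fraction of the variance, replacing the manual eigen-decomposition and cumulative-sum search above. A minimal sketch on synthetic data (`X_demo` and `pca95` are illustrative names):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the scaled embedding matrix
rng = np.random.default_rng(20)
X_demo = rng.normal(size=(300, 50))

# n_components=0.95 keeps the smallest number of components whose
# cumulative explained variance reaches 95%
pca95 = PCA(n_components=0.95, svd_solver='full', whiten=True)
X_demo_red = pca95.fit_transform(X_demo)
```

On the real embeddings this would select a component count close to the one found manually from the cumulative-variance curve.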

7. Build an SVM classifier in order to map each image to its right person. [4 Marks]¶

In [ ]:
#Try to get the best parameters for SVC using grid search

params_grid = [{'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4], 'C': [1, 10, 100, 1000], 'class_weight': ['balanced', None]}]

svc_search = GridSearchCV(SVC(random_state = 20, verbose=True), params_grid, cv = 3, scoring = 'f1_macro', verbose=4)
svc_search.fit(X_train_pca, y_train)

print('Best estimator found by grid search:')
print(svc_search.best_estimator_)
Fitting 3 folds for each of 24 candidates, totalling 72 fits
[LibSVM][CV 1/3] END C=1, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.688 total time=  21.9s
[LibSVM][CV 2/3] END C=1, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.671 total time=  21.3s
[LibSVM][CV 3/3] END C=1, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.660 total time=  21.6s
[LibSVM][CV 1/3] END C=1, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.9s
[LibSVM][CV 2/3] END C=1, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.7s
[LibSVM][CV 3/3] END C=1, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.961 total time=  15.7s
[LibSVM][CV 1/3] END C=1, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.941 total time=  20.9s
[LibSVM][CV 2/3] END C=1, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.940 total time=  21.0s
[LibSVM][CV 3/3] END C=1, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.935 total time=  20.6s
[LibSVM][CV 1/3] END C=1, class_weight=None, gamma=0.01, kernel=rbf;, score=0.668 total time=  21.6s
[LibSVM][CV 2/3] END C=1, class_weight=None, gamma=0.01, kernel=rbf;, score=0.652 total time=  22.2s
[LibSVM][CV 3/3] END C=1, class_weight=None, gamma=0.01, kernel=rbf;, score=0.643 total time=  21.8s
[LibSVM][CV 1/3] END C=1, class_weight=None, gamma=0.001, kernel=rbf;, score=0.958 total time=  15.6s
[LibSVM][CV 2/3] END C=1, class_weight=None, gamma=0.001, kernel=rbf;, score=0.958 total time=  15.9s
[LibSVM][CV 3/3] END C=1, class_weight=None, gamma=0.001, kernel=rbf;, score=0.957 total time=  15.3s
[LibSVM][CV 1/3] END C=1, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.696 total time=  19.0s
[LibSVM][CV 2/3] END C=1, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.684 total time=  19.6s
[LibSVM][CV 3/3] END C=1, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.677 total time=  19.7s
[LibSVM][CV 1/3] END C=10, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.706 total time=  21.6s
[LibSVM][CV 2/3] END C=10, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.692 total time=  21.8s
[LibSVM][CV 3/3] END C=10, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.687 total time=  21.3s
[LibSVM][CV 1/3] END C=10, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.6s
[LibSVM][CV 2/3] END C=10, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.8s
[LibSVM][CV 3/3] END C=10, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.7s
[LibSVM][CV 1/3] END C=10, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.961 total time=  13.4s
[LibSVM][CV 2/3] END C=10, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.960 total time=  13.5s
[LibSVM][CV 3/3] END C=10, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.961 total time=  13.4s
[LibSVM][CV 1/3] END C=10, class_weight=None, gamma=0.01, kernel=rbf;, score=0.706 total time=  21.5s
[LibSVM][CV 2/3] END C=10, class_weight=None, gamma=0.01, kernel=rbf;, score=0.692 total time=  21.5s
[LibSVM][CV 3/3] END C=10, class_weight=None, gamma=0.01, kernel=rbf;, score=0.687 total time=  21.8s
[LibSVM][CV 1/3] END C=10, class_weight=None, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.5s
[LibSVM][CV 2/3] END C=10, class_weight=None, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.3s
[LibSVM][CV 3/3] END C=10, class_weight=None, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.6s
[LibSVM][CV 1/3] END C=10, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.960 total time=  13.2s
[LibSVM][CV 2/3] END C=10, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.959 total time=  13.4s
[LibSVM][CV 3/3] END C=10, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.959 total time=  13.0s
[LibSVM][CV 1/3] END C=100, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.706 total time=  21.8s
[LibSVM][CV 2/3] END C=100, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.692 total time=  21.1s
[LibSVM][CV 3/3] END C=100, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.687 total time=  21.5s
[LibSVM][CV 1/3] END C=100, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.9s
[LibSVM][CV 2/3] END C=100, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.960 total time=  14.9s
[LibSVM][CV 3/3] END C=100, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.6s
[LibSVM][CV 1/3] END C=100, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.954 total time=  12.9s
[LibSVM][CV 2/3] END C=100, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.957 total time=  12.6s
[LibSVM][CV 3/3] END C=100, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.957 total time=  12.4s
[LibSVM][CV 1/3] END C=100, class_weight=None, gamma=0.01, kernel=rbf;, score=0.706 total time=  22.0s
[LibSVM][CV 2/3] END C=100, class_weight=None, gamma=0.01, kernel=rbf;, score=0.692 total time=  22.1s
[LibSVM][CV 3/3] END C=100, class_weight=None, gamma=0.01, kernel=rbf;, score=0.687 total time=  22.0s
[LibSVM][CV 1/3] END C=100, class_weight=None, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.9s
[LibSVM][CV 2/3] END C=100, class_weight=None, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.8s
[LibSVM][CV 3/3] END C=100, class_weight=None, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.5s
[LibSVM][CV 1/3] END C=100, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.954 total time=  12.3s
[LibSVM][CV 2/3] END C=100, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.957 total time=  12.6s
[LibSVM][CV 3/3] END C=100, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.958 total time=  12.4s
[LibSVM][CV 1/3] END C=1000, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.706 total time=  22.1s
[LibSVM][CV 2/3] END C=1000, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.692 total time=  21.6s
[LibSVM][CV 3/3] END C=1000, class_weight=balanced, gamma=0.01, kernel=rbf;, score=0.687 total time=  21.4s
[LibSVM][CV 1/3] END C=1000, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.8s
[LibSVM][CV 2/3] END C=1000, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.8s
[LibSVM][CV 3/3] END C=1000, class_weight=balanced, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.1s
[LibSVM][CV 1/3] END C=1000, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.954 total time=  12.4s
[LibSVM][CV 2/3] END C=1000, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.957 total time=  12.3s
[LibSVM][CV 3/3] END C=1000, class_weight=balanced, gamma=0.0001, kernel=rbf;, score=0.956 total time=  12.7s
[LibSVM][CV 1/3] END C=1000, class_weight=None, gamma=0.01, kernel=rbf;, score=0.706 total time=  21.4s
[LibSVM][CV 2/3] END C=1000, class_weight=None, gamma=0.01, kernel=rbf;, score=0.692 total time=  21.1s
[LibSVM][CV 3/3] END C=1000, class_weight=None, gamma=0.01, kernel=rbf;, score=0.687 total time=  21.8s
[LibSVM][CV 1/3] END C=1000, class_weight=None, gamma=0.001, kernel=rbf;, score=0.959 total time=  16.4s
[LibSVM][CV 2/3] END C=1000, class_weight=None, gamma=0.001, kernel=rbf;, score=0.960 total time=  15.9s
[LibSVM][CV 3/3] END C=1000, class_weight=None, gamma=0.001, kernel=rbf;, score=0.959 total time=  15.5s
[LibSVM][CV 1/3] END C=1000, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.954 total time=  12.3s
[LibSVM][CV 2/3] END C=1000, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.957 total time=  12.6s
[LibSVM][CV 3/3] END C=1000, class_weight=None, gamma=0.0001, kernel=rbf;, score=0.956 total time=  12.8s
[LibSVM]Best estimator found by grid search:
SVC(C=10, class_weight='balanced', gamma=0.0001, random_state=20, verbose=True)
In [ ]:
#Use the best parameters as printed above

clf = SVC(C = 10, gamma = 0.0001, kernel = 'rbf', class_weight = 'balanced', random_state = 20)
clf.fit(X_train_pca, y_train)
print('SVC accuracy for train set: {0:.3f}'.format(clf.score(X_train_pca, y_train)))
print('SVC accuracy for test  set: {0:.3f}'.format(clf.score(X_test_pca, y_test)))
SVC accuracy for train set: 0.994
SVC accuracy for test  set: 0.962
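Since the grid search optimized `f1_macro` while only accuracy is reported above, the macro-averaged F1 could also be checked on the test set. A hedged sketch on synthetic data (`X_demo`, `clf_demo`, etc. are illustrative stand-ins, not the notebook's variables):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the PCA-reduced embeddings and encoded labels
X_demo, y_demo = make_classification(n_samples=300, n_features=20,
                                     n_informative=15, n_classes=3,
                                     random_state=20)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2,
                                      random_state=20, stratify=y_demo)
clf_demo = SVC(C=10, gamma='scale', class_weight='balanced', random_state=20)
clf_demo.fit(Xtr, ytr)

# Macro-averaged F1 weights every class equally, matching the
# 'f1_macro' scoring used in the grid search
macro_f1 = f1_score(yte, clf_demo.predict(Xte), average='macro')
print(f'macro F1: {macro_f1:.3f}')
```

On the real data, the same call with `y_test` and `clf.predict(X_test_pca)` would give the metric the grid search actually optimized.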

8. Import and display the test images. [2 Marks]¶

Hint: ‘Benedict Cumberbatch9.jpg’ and ‘Dwayne Johnson4.jpg’ are the test images.

In [ ]:
# Suppressing LabelEncoder warning
warnings.filterwarnings('ignore')

#Create a function to display an image
def display_image(image_path, title):
  example_image = load_image(image_path)
  plt.imshow(example_image)
  plt.title(title)

if RunningInCOLAB:
    Benedict_Cumberbatch9 = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/Benedict Cumberbatch9.jpg' # Google Drive path
    Dwayne_Johnson4 = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision/Dwayne Johnson4.jpg' # Google Drive path
else:
    Benedict_Cumberbatch9 = 'Benedict Cumberbatch9.jpg' #Local path
    Dwayne_Johnson4 = 'Dwayne Johnson4.jpg'  #Local path

display_image(Benedict_Cumberbatch9, "Benedict Cumberbatch")
(output: the Benedict Cumberbatch test image)
In [ ]:
display_image(Dwayne_Johnson4, "Dwayne Johnson")
(output: the Dwayne Johnson test image)

9. Use the trained SVM model to predict the face on both test images. [4 Marks]¶

In [ ]:
def identify(image_path):
  test_image = load_image(image_path)
  img = cv2.resize(test_image, dsize = (224,224))
  # scale RGB values to interval [0,1]
  img = (img / 255.).astype(np.float32)
  #Add a batch dimension and get the embedding for the image
  input_tensor = np.expand_dims(img, axis=0)
  test_image_embeddings = vgg_face_descriptor.predict(input_tensor)[0]

  #Scale the features using the StandardScaler fitted earlier
  X_test_new_sc = sc.transform([test_image_embeddings])
  #Reduce feature dimensions using the PCA fitted earlier
  X_test_new_pca = pca.transform(X_test_new_sc)

  #Predict the identity using the SVC model (X_test_new_pca is already 2-D)
  prediction = clf.predict(X_test_new_pca)
  #Get the identity name from the encoder (inverse transform)
  identity = encoder.inverse_transform(prediction)[0]

  #Display the image with the identified name
  plt.imshow(test_image)
  plt.title(f'Identified as {identity}')

identify(Benedict_Cumberbatch9)
1/1 [==============================] - 0s 24ms/step
(output: test image shown with the predicted identity in the title)
In [ ]:
identify(Dwayne_Johnson4)
1/1 [==============================] - 0s 21ms/step
(output: test image shown with the predicted identity in the title)